Sparse coding of pathology slides compared to transfer learning with deep neural networks.

Journal: BMC Bioinformatics
Published Date:

Abstract

BACKGROUND: Histopathology images of tumor biopsies present unique challenges for applying machine learning to the diagnosis and treatment of cancer. The pathology slides are high resolution, with files often exceeding 1 GB, have non-uniform dimensions, and often contain multiple tissue slices of varying sizes surrounded by large empty regions. The locations of abnormal or cancerous cells, which may constitute only a small portion of any given tissue sample, are not annotated. Cancer image datasets are also extremely imbalanced, with most slides associated with relatively common cancers. Deep representations trained on natural photographs are unlikely to be optimal for classifying pathology slide images, which have different spectral ranges and spatial structure. We therefore describe an approach for learning features and inferring representations of cancer pathology slides based on sparse coding.
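
For readers unfamiliar with sparse coding, the following is a minimal sketch of how a sparse representation of a single image patch might be inferred. It is illustrative only and is not the authors' implementation: the abstract does not say which solver or dictionary they use. The sketch assumes a fixed dictionary D with unit-norm columns and uses ISTA (iterative soft thresholding) to minimise 0.5*||x - D a||^2 + lam*||a||_1. The names sparse_code_ista, soft_threshold, lam and n_iter, along with the random dictionary and synthetic patch, are hypothetical.

```python
import numpy as np


def soft_threshold(x, t):
    """Shrink each entry of x toward zero by t (proximal operator of the L1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def sparse_code_ista(patch, dictionary, lam=0.1, n_iter=200):
    """Infer a sparse code a for `patch` by minimising
    0.5 * ||patch - dictionary @ a||^2 + lam * ||a||_1 with ISTA.

    Assumption: `dictionary` has shape (n_pixels, n_atoms) and is held fixed;
    learning the dictionary itself is not shown here.
    """
    # Lipschitz constant of the gradient: squared largest singular value of D.
    lipschitz = np.linalg.norm(dictionary, 2) ** 2
    code = np.zeros(dictionary.shape[1])
    for _ in range(n_iter):
        # Gradient of the reconstruction term with respect to the code.
        grad = dictionary.T @ (dictionary @ code - patch)
        # Gradient step followed by soft thresholding.
        code = soft_threshold(code - grad / lipschitz, lam / lipschitz)
    return code


# Synthetic stand-ins (assumptions): a random overcomplete dictionary and a
# patch built from three of its atoms. Real patches would come from slide tiles.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)  # unit-norm atoms
true_code = np.zeros(128)
true_code[[3, 40, 99]] = [1.5, -2.0, 1.0]
patch = D @ true_code

code = sparse_code_ista(patch, D, lam=0.1, n_iter=500)
print("nonzero coefficients:", np.count_nonzero(code))
```

In this sketch the inferred vector `code` plays the role of the patch representation that a downstream classifier could consume. The penalty weight lam trades reconstruction error against sparsity.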

Authors

  • Will Fischer
    Los Alamos National Laboratory, Los Alamos, NM, USA. wfischer@lanl.gov.
  • Sanketh S Moudgalya
    Chester F. Carlson Center for Imaging Science, Rochester Institute of Technology, Rochester, NY, USA.
  • Judith D Cohn
    Los Alamos National Laboratory, Los Alamos, NM, USA.
  • Nga T T Nguyen
    Los Alamos National Laboratory, Los Alamos, NM, USA.
  • Garrett T Kenyon
    Los Alamos National Laboratory, Los Alamos, NM, USA.