Deep learning integrates histopathology and proteogenomics at a pan-cancer level.

Journal: Cell reports. Medicine
Published Date:

Abstract

We introduce a pioneering approach that integrates pathology imaging with transcriptomics and proteomics to identify predictive histology features associated with critical clinical outcomes in cancer. We utilize 2,755 H&E-stained histopathological slides from 657 patients across 6 cancer types from CPTAC. Our models effectively recapitulate distinctions readily made by human pathologists: tumor vs. normal (AUROC = 0.995) and tissue-of-origin (AUROC = 0.979). We further investigate predictive power on tasks not normally performed from H&E alone, including TP53 prediction and pathologic stage. Importantly, we describe predictive morphologies not previously utilized in a clinical setting. The incorporation of transcriptomics and proteomics identifies pathway-level signatures and cellular processes driving predictive histology features. Model generalizability and interpretability are confirmed using TCGA. We propose a classification system for these tasks and suggest potential clinical applications for this integrated human and machine learning approach. A publicly available web-based platform implements these models.

Authors

  • Joshua M Wang
    Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY 10016, USA; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA.
  • Runyu Hong
    Institute for Systems Genetics, Grossman School of Medicine, New York University, New York, New York, USA.
  • Elizabeth G Demicco
    Department of Pathology and Laboratory Medicine, Mount Sinai Hospital and Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada. Electronic address: elizabeth.demicco@sinaihealth.ca.
  • Jimin Tan
    Center for Data Science, New York University, New York, NY.
  • Rossana Lazcano
    Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
  • Andre L Moreira
    Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
  • Yize Li
    Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA.
  • Anna Calinawan
    Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
  • Narges Razavian
    Department of Computer Science, New York University, New York, New York.
  • Tobias Schraink
    Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY 10016, USA; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA; Division of Precision Medicine, Department of Medicine, NYU Grossman School of Medicine, New York, NY 10016, USA.
  • Michael A Gillette
    The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Massachusetts General Hospital Division of Pulmonary and Critical Care Medicine, Boston, MA 02114, USA; Harvard Medical School, Boston, MA 02115, USA.
  • Gilbert S Omenn
  • Eunkyung An
    Office of Cancer Clinical Proteomics Research, National Cancer Institute, Rockville, MD 20850, USA.
  • Henry Rodriguez
    Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD 20892, USA.
  • Aristotelis Tsirigos
    Department of Pathology, NYU School of Medicine, New York, NY 10016, USA; Laura and Isaac Perlmutter Cancer Center, NYU School of Medicine, New York, NY 10016, USA; Applied Bioinformatics Laboratories, NYU School of Medicine, New York, NY 10016, USA. Electronic address: aristotelis.tsirigos@nyulangone.org.
  • Kelly V Ruggles
    Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY 10016, USA; Division of Precision Medicine, Department of Medicine, NYU Grossman School of Medicine, New York, NY 10016, USA.
  • Li Ding
    College of Chemistry and Food Engineering, Changsha University of Science and Technology, Changsha 410014, China.
  • Ana I Robles
    Office of Cancer Clinical Proteomics Research, National Cancer Institute, Rockville, MD 20850, USA.
  • D R Mani
    The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
  • Karin D Rodland
    Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA.
  • Alexander J Lazar
    The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
  • Wenke Liu
    Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY 10016, USA; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA. Electronic address: wenke.liu@nyulangone.org.
  • David Fenyö
    Department of Biochemistry and Molecular Pharmacology, Institute for Systems Genetics, New York University Grossman School of Medicine, New York, USA.