Machine learning analysis of DNA methylation profiles distinguishes primary lung squamous cell carcinomas from head and neck metastases.

Journal: Science translational medicine
Published Date:

Abstract

Patients with head and neck squamous cell carcinoma (HNSC) are at risk of developing either pulmonary metastases or a second squamous cell carcinoma of the lung (LUSC). Differentiating pulmonary metastases from primary lung cancers is of high clinical importance, but is not possible in most cases with current diagnostics. To address this, we performed DNA methylation profiling of primary tumors and trained three different machine learning methods to distinguish metastatic HNSC from primary LUSC. We developed an artificial neural network that correctly classified 96.4% of the cases in a validation cohort of 279 patients with HNSC and LUSC as well as normal lung controls, outperforming support vector machines (95.7%) and random forests (87.8%). By applying thresholds to the resulting probability scores and excluding samples with low confidence, prediction accuracies of more than 99% were achieved for 92.1% (neural network), 90% (support vector machine), and 43% (random forest) of these cases. As independent clinical validation of the approach, we analyzed a series of 51 patients with a history of HNSC and a second lung tumor, demonstrating correct classification as judged by clinicopathological properties. In summary, our approach may facilitate the reliable diagnostic differentiation of pulmonary metastases of HNSC from primary LUSC to guide therapeutic decisions.
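
The thresholding step described in the abstract (accept a prediction only when its probability score is high enough, otherwise exclude the sample as low confidence) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy scores, and the 0.9 threshold are all hypothetical, and the abstract does not state which threshold values were used. The sketch only shows how raising the threshold trades coverage (the fraction of samples that receive a call) for accuracy among the retained samples.

```python
from typing import List, Optional, Sequence, Tuple

# Class names follow the abstract; their order here is an assumption.
LABELS = ("HNSC", "LUSC", "normal")


def classify_with_rejection(probs: Sequence[float], threshold: float) -> Optional[str]:
    """Return the top-scoring label, or None if its score is below the threshold."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return LABELS[best] if probs[best] >= threshold else None


def accuracy_and_coverage(
    scores: Sequence[Sequence[float]], truth: Sequence[str], threshold: float
) -> Tuple[float, float]:
    """Accuracy among retained samples, and the fraction of samples retained."""
    calls = [classify_with_rejection(p, threshold) for p in scores]
    kept = [(c, t) for c, t in zip(calls, truth) if c is not None]
    coverage = len(kept) / len(truth)
    accuracy = sum(c == t for c, t in kept) / len(kept) if kept else float("nan")
    return accuracy, coverage


# Hypothetical probability scores (HNSC, LUSC, normal) for four samples.
toy_scores: List[Tuple[float, float, float]] = [
    (0.97, 0.02, 0.01),
    (0.05, 0.93, 0.02),
    (0.55, 0.40, 0.05),  # low-confidence call that is also wrong
    (0.01, 0.04, 0.95),
]
toy_truth = ["HNSC", "LUSC", "LUSC", "normal"]

if __name__ == "__main__":
    for thr in (0.0, 0.9):  # 0.9 is an illustrative value only
        acc, cov = accuracy_and_coverage(toy_scores, toy_truth, thr)
        print(f"threshold={thr:.1f}  accuracy={acc:.2f}  coverage={cov:.2f}")
```

In this toy example the one misclassified sample is also the one with the lowest top score, so excluding it raises accuracy on the retained samples while lowering coverage, which is the same kind of trade-off the abstract reports for the three classifiers.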

Authors

  • Philipp Jurmeister
    Institute of Pathology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, 10117 Berlin, Germany.
  • Michael Bockmayr
    Institute of Pathology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, 10117 Berlin, Germany.
  • Philipp Seegerer
    Machine-Learning Group, Department of Software Engineering and Theoretical Computer Science, Technical University of Berlin, 10623 Berlin, Germany.
  • Teresa Bockmayr
    Institute of Pathology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, 10117 Berlin, Germany.
  • Denise Treue
    Institute of Pathology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, 10117 Berlin, Germany.
  • Grégoire Montavon
    Machine Learning Group, Technische Universität Berlin, Berlin, Germany.
  • Claudia Vollbrecht
    Institute of Pathology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, 10117 Berlin, Germany.
  • Alexander Arnold
    Institute of Pathology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, 10117 Berlin, Germany.
  • Daniel Teichmann
    Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA.
  • Keno Bressem
    Department of Radiology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, 10117 Berlin, Germany.
  • Ulrich Schüller
    Center for Neuropathology and Prion Research, Ludwig-Maximilians-University Munich, Munich, Germany; Institute of Neuropathology, University Medical Center, Hamburg-Eppendorf, Germany; Research Institute Children's Cancer Center, Hamburg, Germany; Department of Pediatric Hematology and Oncology, University Medical Center, Hamburg-Eppendorf, Germany.
  • Maximilian von Laffert
    Institute of Pathology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, 10117 Berlin, Germany.
  • Klaus-Robert Müller
    Berlin Institute for the Foundations of Learning and Data (BIFOLD), Berlin, Germany.
  • David Capper
    German Cancer Consortium (DKTK), Partner Site Berlin, and German Cancer Research Center (DKFZ), 69210 Heidelberg, Germany. Email: frederick.klauschen@charite.de, david.capper@charite.de.
  • Frederick Klauschen
    Institute of Pathology, Ludwig-Maximilians-Universität München, Thalkirchner Str. 36, 80337 Munich, Germany. Email: f.klauschen@lmu.de.