Artificial intelligence-aided clinical annotation of a large multi-cancer genomic dataset.

Journal: Nature Communications
Published Date:

Abstract

To accelerate cancer research that correlates biomarkers with clinical endpoints, methods are needed to ascertain outcomes from electronic health records at scale. Here, we train deep natural language processing (NLP) models to extract outcomes for participants with any of 7 solid tumors in a precision oncology study. Outcomes are extracted from 305,151 imaging reports for 13,130 patients and 233,517 oncologist notes for 13,511 patients, including patients with 6 additional cancer types. NLP models recapitulate outcome annotation from these documents, including the presence of cancer, progression/worsening, response/improvement, and metastases, with excellent discrimination (AUROC > 0.90). Models generalize to cancers excluded from training and yield outcomes correlated with survival. Among patients receiving checkpoint inhibitors, we confirm that high tumor mutation burden is associated with superior progression-free survival ascertained using NLP. These results show that deep NLP can accelerate annotation of molecular cancer datasets with clinically meaningful endpoints to facilitate discovery.
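
The discrimination figure quoted above (AUROC > 0.90) can be read as the probability that a model scores a randomly chosen positive document higher than a randomly chosen negative one. The following is a minimal sketch of that definition, not the authors' evaluation code; the function name `auroc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 ground-truth annotations (hypothetical toy data).
    scores: iterable of model-predicted probabilities, same length as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: 1 = annotator marked "progression/worsening", 0 = not marked.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.2, 0.4, 0.6]
print(auroc(toy_labels, toy_scores))  # 3 of 4 pairs ranked correctly -> 0.75
```

The pairwise form is quadratic in the number of documents, so it suits illustration only; at the scale of hundreds of thousands of reports, a rank-based implementation from a statistics library would be the practical choice.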

Authors

  • Kenneth L Kehl
    Department of Medicine, Dana-Farber Cancer Institute, Boston, MA, 02215, United States.
  • Wenxin Xu
    Dana-Farber Cancer Institute, Boston, MA, USA.
  • Alexander Gusev
    Dana-Farber Cancer Institute, Boston, MA, USA.
  • Ziad Bakouny
    Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA.
  • Toni K Choueiri
    Lank Center for Genitourinary Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA.
  • Irbaz Bin Riaz
    Department of Medical Oncology, Mayo Clinic, Scottsdale, Arizona, USA.
  • Haitham Elmarakeby
    Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA.
  • Eliezer M Van Allen
    Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Harvard University, Boston, Massachusetts.
  • Deborah Schrag
    Memorial Sloan Kettering Cancer Center, New York, USA.