Image-based deep learning model using DNA methylation data predicts the origin of cancer of unknown primary.

Journal: Neoplasia (New York, N.Y.)
PMID:

Abstract

Cancer of unknown primary (CUP) is a rare type of metastatic cancer in which the origin of the tumor is unknown. Since the treatment strategy for patients with metastatic tumors depends on knowing the primary site, accurate identification of the site of origin is important. Here, we developed an image-based deep-learning model that uses a vision transformer algorithm to predict the origin of CUP. Using a DNA methylation dataset of 8,233 primary tumors from The Cancer Genome Atlas (TCGA), we categorized 29 cancer types into 18 organ classes and extracted 2,312 differentially methylated CpG sites (DMCs) from the non-squamous cancer group and 420 DMCs from the squamous cell cancer group. Using these DMCs, we created organ-specific DNA methylation images and used them for model training and testing. Model performance was evaluated using 394 metastatic cancer samples from TCGA (TCGA-meta) and 995 samples (693 primary and 302 metastatic cancers) obtained from 20 independent external studies. We found that the DNA methylation images show distinct patterns according to the origin of the cancer. Our model achieved an overall accuracy of 96.95 % in the TCGA-meta dataset. In the external validation datasets, our classifier achieved overall accuracies of 96.39 % and 94.37 % in primary and metastatic tumors, respectively. Notably, the overall accuracies for non-squamous cell cancers were particularly high, at 96.79 % and 96.85 % for primary and metastatic samples, respectively.
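
The abstract does not specify how the DMCs are laid out as an image, so the following is only a minimal sketch of one plausible encoding, not the authors' method. It assumes that each sample is a vector of methylation beta values in the range 0 to 1, that the values are placed row by row into the smallest square grid that holds them, and that unused pixels and missing values are set to a fill value. The function name `betas_to_image` and the `fill` parameter are hypothetical.

```python
import math
import numpy as np

def betas_to_image(betas, fill=0.0):
    """Arrange a 1-D vector of beta values (0..1) into a square 8-bit image.

    Assumption: row-major layout in the smallest square grid that fits;
    the layout actually used in the paper is not given in the abstract.
    """
    betas = np.asarray(betas, dtype=float)
    if betas.ndim != 1 or betas.size == 0:
        raise ValueError("expected a non-empty 1-D vector of beta values")

    # Smallest side length whose square holds every value (ceil of sqrt).
    side = math.isqrt(betas.size - 1) + 1

    # Replace missing probes with the fill value and keep values in [0, 1].
    cleaned = np.clip(np.nan_to_num(betas, nan=fill), 0.0, 1.0)

    # Pad the tail so the vector fills the square grid exactly.
    padded = np.full(side * side, fill, dtype=float)
    padded[: cleaned.size] = cleaned

    # Scale to 0..255 grayscale and reshape into a 2-D image.
    return np.round(padded * 255).astype(np.uint8).reshape(side, side)

# Illustrative input only: synthetic values standing in for 2,312 DMCs.
example = betas_to_image(np.linspace(0.0, 1.0, 2312))
```

Under these assumptions, 2,312 DMCs fit a 49 × 49 grid and 420 DMCs fit a 21 × 21 grid; a vision transformer would then typically split such an image into fixed-size patches before classification.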

Authors

  • Jinha Hwang
    Department of Laboratory Medicine, Korea University Anam Hospital, Seoul, the Republic of Korea.
  • Yeajina Lee
    Department of Biomedical Sciences, Seoul National University Graduate School, Seoul, the Republic of Korea; Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, the Republic of Korea.
  • Seong-Keun Yoo
    Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Oncological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Artificial Intelligence and Human Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Icahn Genomics Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA. Electronic address: seong-keun.yoo@mssm.edu.
  • Jong-Il Kim
    Department of Biomedical Sciences, Seoul National University Graduate School, Seoul, the Republic of Korea; Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, the Republic of Korea. Electronic address: jongil@snu.ac.kr.