Machine learning models for pancreatic cancer diagnosis based on microbiome markers from serum extracellular vesicles.

Journal: Scientific Reports
PMID:

Abstract

Pancreatic cancer (PC) is a fatal disease with an extremely low 5-year survival rate, mainly because it is rarely detected at an early stage. Given emerging evidence of a relationship between microbiota composition and disease, this study aims to identify microbiome markers for the diagnosis of pancreatic cancer. We used extracellular vesicle (EV) data obtained from blood samples of 38 pancreatic cancer patients and 51 healthy controls. The least absolute shrinkage and selection operator (LASSO) and a stepwise method were used to select candidate markers at the genus and phylum levels. These markers were used to develop machine learning models, including logistic regression (LR), random forest (RF), support vector machine (SVM), and deep neural network (DNN) models. At the phylum level, the DNN performed best, using three markers (Verrucomicrobia, Actinobacteria, and Proteobacteria) selected by the stepwise method, with a test AUC of 0.959. At the genus level, the DNN using 11 markers selected by LASSO (Ruminococcaceae UCG-013, Ruminiclostridium, Propionibacterium, Lachnospiraceae NK4A136 group, Corynebacterium.1, Akkermansia, Mucispirillum, Pseudomonas, Diaphorobacter, Clostridium sensu stricto 1, and Turicibacter) outperformed the other models, with a test AUC of 0.961. These results highlight the potential of microbiome markers and prediction models in clinical studies of PC diagnosis.
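
The models above are compared by test AUC, the area under the ROC curve. As a rough illustration of what that number measures, the sketch below computes AUC as the fraction of (patient, control) pairs in which the patient receives the higher predicted score, with ties counted as half. The function name `roc_auc` and the toy labels and scores are hypothetical and invented for this example; they are not data or code from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the AUC: the share of (case, control) pairs ranked correctly.

    labels: 1 for a case (e.g. PC patient), 0 for a control.
    scores: model output, higher meaning "more likely a case".
    Tied pairs contribute 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up toy scores for illustration only (not results from the study).
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

In this toy set, 11 of the 12 patient-control pairs are ordered correctly, so the AUC is 11/12. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in sample size, which is acceptable for cohorts of this scale but would be replaced by a rank-based formula on larger data.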

Authors

  • Doeun Lee
    Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 08826, Korea.
  • Chanhee Lee
    Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea.
  • Kyulhee Han
    Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 08826, Korea.
  • Taewan Goo
    Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea.
  • Boram Kim
    Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 08826, Korea.
  • Youngmin Han
    Department of Surgery and Cancer Research Institute, Seoul National University College of Medicine, 101 Daehak-ro, Chongno-gu, Seoul, 03080, South Korea.
  • Wooil Kwon
    Department of Surgery and Cancer Research Institute, Seoul National University College of Medicine, 101 Daehak-ro, Chongno-gu, Seoul, 03080, South Korea.
  • Seungyeoun Lee
    Department of Mathematics & Statistics, Sejong University, Seoul, Republic of Korea.
  • Jin-Young Jang
  • Taesung Park
    Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea.