Classification of NSCLC subtypes using lung microbiome from resected tissue based on machine learning methods.

Journal: NPJ systems biology and applications
PMID:

Abstract

Distinguishing adenocarcinoma (AC) from squamous cell carcinoma (SCC) poses significant challenges for cytopathologists and often requires clinical tests and biopsies that delay treatment initiation. To address this, we developed a machine learning-based approach that uses the resected lung-tissue microbiome of AC and SCC patients for subtype classification. Differentially enriched taxa were identified using LEfSe, revealing ten potential microbial markers. Linear discriminant analysis (LDA) was subsequently applied to enhance inter-class separability. Next, six supervised classification algorithms were benchmarked: logistic regression, naïve Bayes, random forest, extreme gradient boosting (XGBoost), k-nearest neighbor, and a deep neural network. Notably, XGBoost outperformed the other five classification algorithms on LDA-transformed features, achieving an accuracy of 76.25% and an AUROC (area under the receiver operating characteristic curve) of 0.81, with 69% specificity and 76% sensitivity. Validation on an independent dataset confirmed its robustness, with an AUROC of 0.71 and minimal false positives and false negatives. This study is the first to classify AC and SCC subtypes using the lung-tissue microbiome.
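
For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, sensitivity, specificity, and AUROC are computed for a binary classifier. It is not the authors' code: the labels and scores are toy values invented for illustration, the function names are hypothetical, and treating one subtype as the "positive" class (label 1) is an assumption, since the abstract does not state which subtype was coded as positive.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true positives that the classifier recovers."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of true negatives that the classifier recovers."""
    return tn / (tn + fp)


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive sample scores higher
    than a random negative sample; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (NOT study data): 4 positive and 4 negative samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.6, 0.3, 0.2, 0.1]   # hypothetical classifier scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed decision threshold of 0.5

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print("accuracy:   ", accuracy(tp, fp, tn, fn))
print("sensitivity:", sensitivity(tp, fn))
print("specificity:", specificity(tn, fp))
print("AUROC:      ", auroc(y_true, scores))
```

Accuracy, sensitivity, and specificity depend on the chosen decision threshold, whereas AUROC summarizes ranking quality across all thresholds, which is why the two kinds of metric are reported side by side.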

Authors

  • Pragya Kashyap
    Department of Bioscience & Bioengineering, Indian Institute of Technology, Jodhpur, Rajasthan, India.
  • Kalbhavi Vadhi Raj
    Department of Electrical Engineering, Indian Institute of Technology, Jodhpur, Rajasthan, India.
  • Jyoti Sharma
    Manipal Academy of Higher Education (MAHE), Manipal, Karnataka, India.
  • Naveen Dutt
    All India Institute of Medical Sciences, Jodhpur, Rajasthan, India.
  • Pankaj Yadav
    (1) Institute of Medical Bioinformatics and Systems Medicine, Medical Center, Faculty of Medicine, Albert-Ludwigs University of Freiburg, 79110 Freiburg, Germany; (2) Department of Computer Engineering, Zakir Husain College of Engineering and Technology, Aligarh Muslim University, Aligarh, Uttar Pradesh, India; (3) Department of Bioscience and Bioengineering, Indian Institute of Technology, Jodhpur, India; (4) Institute of Bioinformatics, International Technology Park, Bangalore, 560066, India; (5) Manipal Academy of Higher Education (MAHE), Manipal 576104, Karnataka, India.