Development and validation of machine learning models for early diagnosis and prognosis of lung adenocarcinoma using miRNA expression profiles.
Journal:
Cancer biomarkers : section A of Disease markers
PMID:
40171815
Abstract
Objective: This study aimed to develop diagnostic and prognostic models for lung adenocarcinoma (LUAD) using machine learning (ML) algorithms to improve the accuracy of clinical decision-making.

Methods: Data from The Cancer Genome Atlas (TCGA) for LUAD patients were split into a training set (n = 196) and a test set (n = 133). Feature selection with Least Absolute Shrinkage and Selection Operator (LASSO), Random Forest (RF), and Support Vector Machine (SVM) methods identified miRNAs distinguishing stage I LUAD. Six ML algorithms predicted pulmonary node classification. Model performance was evaluated using Receiver Operating Characteristic (ROC) curves, Precision-Recall (PR) curves, and error rates (CE). A prognostic model was constructed using LASSO Cox regression. Risk score plots were generated, and model performance was assessed using Kaplan-Meier (K-M) and time-dependent ROC curves. Functional enrichment analyses investigated miRNA function and mechanism.

Results: Feature selection identified five miRNAs that distinguish early-stage LUAD from adjacent non-cancerous tissue. A prognostic model using 13 miRNAs predicted poorer outcomes for patients with higher risk scores, supported by time-dependent ROC curves and a nomogram. Functional enrichment analysis identified cancer-related signaling pathways for the biomarkers.

Conclusion: ML identified a diagnostic five-miRNA signature and a prognostic 13-miRNA model for LUAD, both robust and reliable.
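
The abstract does not say how the risk score is computed. In Cox-based signatures it is commonly the linear predictor: the sum of each selected miRNA's regression coefficient times its expression value, with patients then split into high- and low-risk groups. A minimal sketch under that assumption follows; the miRNA names, coefficients, expression values, and the median cutoff are all hypothetical and are not taken from the study.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over the model's miRNAs."""
    return sum(coefficients[m] * expression[m] for m in coefficients)


def stratify(scores):
    """Label patients 'high' or 'low' risk relative to the cohort median.

    The median cutoff is an assumption; the abstract does not state the cutoff used.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients and expression values, for illustration only.
coefficients = {"miR-A": 0.5, "miR-B": -0.25}
patients = {
    "P1": {"miR-A": 2.0, "miR-B": 4.0},
    "P2": {"miR-A": 6.0, "miR-B": 2.0},
    "P3": {"miR-A": 4.0, "miR-B": 0.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = stratify(scores)
```

In a real analysis the coefficients would come from the fitted LASSO Cox model, and the resulting groups would be compared with Kaplan-Meier curves, as the abstract describes.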