Leveraging Bioinformatics and Machine Learning for Identifying Prognostic Biomarkers and Predicting Clinical Outcomes in Lung Adenocarcinoma.
Journal: Genes
PMID: 39766765
Abstract
Lung adenocarcinoma (LUAD) poses significant challenges because of its poor prognosis and limited treatment options, particularly in advanced stages. Identifying genetic biomarkers is crucial for improving outcome prediction and guiding personalized therapy. In this study, we use a multi-step approach that combines principled sure independence screening, penalized regression methods, and information gain to identify key genetic features in ultra-high-dimensional RNA-sequencing data from LUAD patients. We then compare the predictive performance of three survival analysis methods: the Cox model, the survival tree, and random survival forests (RSFs). Additionally, a protein-protein interaction network is used to explore the biological significance of the identified genes. and are consistently selected as significant predictors across all feature selection methods. Kaplan-Meier analysis shows that high expression levels of these genes are strongly correlated with poorer survival outcomes, suggesting their potential as prognostic biomarkers. RSF outperforms the Cox and survival tree methods, achieving higher AUC and C-index values. The protein-protein interaction network highlights key nodes such as and , which play central roles in LUAD progression. Our findings provide insight into the genetic mechanisms of LUAD and contribute to the development of more accurate prognostic tools and personalized treatment strategies.
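The abstract relies on two standard survival-analysis quantities: the Kaplan-Meier survival estimate and the concordance index (C-index) used to compare models. The sketch below illustrates both in plain Python on made-up toy data. It is not the authors' code; the variable names, the toy values, and the simplification of skipping tied survival times in the C-index are all assumptions made for illustration.

```python
from itertools import combinations


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival probability).

    events[i] is 1 if the event (death) was observed at times[i],
    0 if the observation was censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A higher risk score should go with a shorter survival time.
    Assumption: pairs with tied survival times are skipped.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        short, long_ = (i, j) if times[i] < times[j] else (j, i)
        if not events[short]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[short] > risk_scores[long_]:
            concordant += 1.0
        elif risk_scores[short] == risk_scores[long_]:
            concordant += 0.5
    return concordant / comparable if comparable else float("nan")


# Hypothetical toy data: survival time in months, event flag, and a risk
# score (for example, the expression level of a candidate gene).
times = [5, 8, 12, 12, 20, 25]
events = [1, 1, 0, 1, 1, 0]
risk_scores = [2.9, 1.5, 1.0, 1.8, 0.7, 0.4]

if __name__ == "__main__":
    for t, s in kaplan_meier(times, events):
        print(f"t={t:>2}  S(t)={s:.3f}")
    print(f"C-index: {harrell_c_index(times, events, risk_scores):.3f}")
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the sense in which one model "outperforms" another on this metric.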