Using machine learning to discover DNA metabolism biomarkers that direct prostate cancer treatment.
Journal:
Scientific Reports
Published Date:
Jul 18, 2025
Abstract
DNA metabolism genes play pivotal roles in regulating cellular processes that contribute to cancer progression, immune modulation, and therapeutic response in prostate cancer (PC). Understanding how these genes influence the tumor microenvironment and immune evasion is crucial for identifying prognostic biomarkers and developing targeted therapies. We performed an integrative analysis using transcriptomic data from the TCGA cohort and external validation datasets. Differentially expressed genes (DEGs) were identified using the edgeR algorithm with thresholds of FDR < 0.01 and a minimum fold change of 1.5. Enrichment analysis of GO terms and KEGG pathways was conducted to explore the biological significance of DNA metabolism genes in PC. In addition, clustering analyses, machine learning models, and single-cell RNA sequencing (scRNA-seq) were employed to investigate the immune characteristics, prognostic value, and therapeutic relevance of these genes. A total of 536 DEGs were identified across six subtypes of prostate cancer, with key DNA metabolism genes such as POLD2, RAD9A, REV3L, MSH6, and WRNIP1 highlighted as critical players. Gene enrichment analyses revealed that these DEGs were significantly associated with pathways involved in DNA repair, cellular aging, and telomere maintenance. Clustering analysis identified two distinct subgroups (C1 and C2) based on DNA metabolism gene expression, with C1 exhibiting a more aggressive phenotype, higher immune infiltration, and poorer prognosis. Among the machine learning models, the CoxBoost algorithm in particular identified 21 key genes contributing to an effective prognostic model. Furthermore, scRNA-seq analysis confirmed the upregulation of DNA metabolism genes in PC cells compared with normal cells. Our findings highlight the importance of DNA metabolism genes in the progression and immune dynamics of PC. These genes not only serve as potential biomarkers for prognosis but also offer promising targets for personalized therapies. The integration of multi-omics data and advanced computational models provides new insights into the molecular underpinnings of PC and holds potential for improving treatment strategies.
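The DEG selection criteria stated in the abstract (FDR < 0.01 and a minimum fold change of 1.5) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used edgeR. The row layout (keys `gene`, `logFC`, `FDR`), the function name `filter_degs`, and the placeholder gene names and values are all hypothetical and are not results from the study. The sketch also assumes that the fold-change threshold applies in both directions on a log2 scale, which the abstract does not state.

```python
import math

# Thresholds taken from the abstract.
FDR_CUTOFF = 0.01
MIN_FOLD_CHANGE = 1.5


def filter_degs(rows, fdr_cutoff=FDR_CUTOFF, min_fold_change=MIN_FOLD_CHANGE):
    """Keep rows that pass both the FDR and the fold-change threshold.

    Each row is a dict with hypothetical keys:
      "gene"  - gene identifier
      "logFC" - log2 fold change (the scale edgeR reports)
      "FDR"   - multiple-testing-adjusted p-value
    Assumption: the fold-change cutoff is applied to |log2FC|, so both
    up- and down-regulated genes can pass.
    """
    min_abs_log2fc = math.log2(min_fold_change)  # about 0.585 for 1.5-fold
    return [
        row
        for row in rows
        if row["FDR"] < fdr_cutoff and abs(row["logFC"]) >= min_abs_log2fc
    ]


# Placeholder rows for illustration only; these are NOT data from the study.
example_rows = [
    {"gene": "GENE_A", "logFC": 1.20, "FDR": 0.001},   # passes both thresholds
    {"gene": "GENE_B", "logFC": 0.30, "FDR": 0.0005},  # fold change too small
    {"gene": "GENE_C", "logFC": -0.90, "FDR": 0.004},  # passes (down-regulated)
    {"gene": "GENE_D", "logFC": 2.00, "FDR": 0.05},    # FDR too high
]

selected = [row["gene"] for row in filter_degs(example_rows)]
print(selected)
```

Converting the fold-change cutoff to the log2 scale once, outside the loop, keeps the comparison consistent with how differential expression tools typically report effect sizes.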