Machine learning prediction of overall survival in prostate adenocarcinoma using ensemble techniques.
Journal:
Computers in biology and medicine
PMID:
40081210
Abstract
Prostate adenocarcinoma (PAC) is a complex and common cancer in males and one of the leading causes of cancer-related death globally. PAC is a multifaceted disease that encompasses different subtypes, including acinar and ductal adenocarcinoma, small cell carcinoma, neuroendocrine tumors, and transitional cell carcinoma, each presenting distinct prognostic difficulties. Predicting the overall survival (OS) of individuals with PAC remains a substantial clinical challenge because of the heterogeneity of the disease, coexisting medical conditions, and the limitations of conventional diagnostic markers. We therefore use ensemble machine learning (ML) models to predict the OS of PAC patients. We evaluated eight ensemble ML models: Random Forest (RF), AdaBoost, Gradient Boosting (GB), Extreme Gradient Boosting (XGB), LightGBM (LGBM), CatBoost, Hard Voting Classifier (HVC), and Support Vector Classifier (SVC), using a data set obtained from The Cancer Genome Atlas (TCGA) PanCancer Atlas. The models were compared on accuracy, precision, recall, F1 score, and ROC AUC. GB outperformed the other models, obtaining a perfect score of 1.0 in accuracy, precision, recall, and F1 score, and a ROC AUC of 0.99. RF and AdaBoost also performed robustly, suggesting their potential in healthcare settings for predicting PAC survival. In conclusion, the study highlights the value of ensemble techniques for improving prediction accuracy and underscores the need for further research in clinical settings.
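As an illustration of the evaluation described in the abstract, the sketch below shows how hard (majority) voting combines the class predictions of several models, and how accuracy, precision, recall, F1 score, and ROC AUC are computed for a binary outcome. It is a minimal sketch in plain Python, not the study's pipeline: the labels, the three sets of model predictions, the probability scores, and all function names are hypothetical toy values chosen for illustration, and none of them come from the TCGA data.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Majority vote per sample; ties go to the smallest label (an arbitrary choice)."""
    voted = []
    for votes in zip(*predictions_per_model):
        counts = Counter(votes)
        top = max(counts.values())
        voted.append(min(label for label, c in counts.items() if c == top))
    return voted


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary outcome."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores, positive=1):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one sample of each class")
    correct = sum(1.0 if p > n else 0.5 if p == n else 0.0
                  for p in pos for n in neg)
    return correct / (len(pos) * len(neg))


# Hypothetical toy data: 1 = event observed, 0 = no event (NOT from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [1, 0, 1, 0, 0, 1, 1, 0]
model_b = [1, 0, 0, 1, 0, 0, 1, 0]
model_c = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical predicted probabilities of the positive class from one model.
scores = [0.9, 0.4, 0.8, 0.45, 0.2, 0.5, 0.7, 0.1]

ensemble = hard_vote([model_a, model_b, model_c])
print("model_a :", classification_metrics(y_true, model_a))
print("ensemble:", classification_metrics(y_true, ensemble))
print("ROC AUC :", roc_auc(y_true, scores))
```

In this toy case each individual model makes mistakes on different samples, so the majority vote corrects them. ROC AUC is computed from ranked scores rather than from class labels, which is why it is reported separately from the four label-based metrics.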