Interpretable machine learning models for survival prediction in prostate cancer bone metastases.

Journal: Scientific reports
Published Date:

Abstract

Prostate cancer bone metastasis (PCBM) is a highly lethal condition with limited survival. Accurate survival prediction is essential for managing these typically incurable patients. However, existing clinical models lack precision. This study aimed to establish machine learning models to improve survival predictions for PCBM patients. We extracted data for PCBM patients from the SEER database spanning 2010 to 2019. Prognostic features were identified through univariate and multivariate Cox regression analyses. To predict survival outcomes, we developed and validated XGBoost models with five-fold cross-validation. Model performance was assessed based on the area under the receiver operating characteristic curve (AUC) and overall accuracy. Feature importance was assessed using SHAP (SHapley Additive exPlanations) values, while decision curve analysis was conducted to determine the clinical applicability of the models. Additionally, Kaplan-Meier (K-M) analysis was employed to examine the impact of surgery, radiotherapy, and chemotherapy on the survival of PCBM patients. The XGBoost models achieved robust performance in predicting survival for PCBM patients, with AUC values of 0.76, 0.83, and 0.91 for 1-year, 3-year, and 5-year survival predictions, respectively, in the test set. Key prognostic factors included T stage, grade, age, PSA, and Gleason score. Single patients exhibited a significantly higher mortality risk than their married counterparts (HR = 1.23, 95% CI 1.19-1.27, p < 0.001). Conversely, a median household income exceeding $75,000 was associated with a notably reduced mortality risk (HR = 0.87, 95% CI 0.85-0.90, p < 0.001). Univariate Cox analysis showed that surgery, chemotherapy, and radiotherapy were all significantly associated with improved survival. However, multivariate Cox regression analysis indicated that only chemotherapy (HR = 0.85, 95% CI 0.81-0.89, p < 0.001) and radiotherapy (HR = 0.96, 95% CI 0.93-0.99, p = 0.032) remained significant, while surgery (HR = 0.98, 95% CI 0.93-1.03, p = 0.387) did not. SHAP summary and force plots were used to interpret the XGBoost model on both a global and a local scale. Subsequently, a web-based tool was created to streamline the integration of this predictive model into clinical settings. Our study examined the clinical features of patients with PCBM and developed six machine learning models for prognosis, with the XGBoost model demonstrating the highest performance. The model's high accuracy and interpretability provide valuable support for developing personalized treatment plans for PCBM patients.
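
The decision curve analysis mentioned in the abstract weighs true positives against false positives at a chosen risk threshold. The sketch below illustrates the standard net-benefit formula only; it is not the authors' code. The function names and the toy labels and probabilities are invented for illustration and are not taken from the SEER data.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold.

    Standard decision-curve formula: TP/n - (FP/n) * pt / (1 - pt),
    where pt is the threshold probability.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    # Count true and false positives among patients flagged at this threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def treat_all_net_benefit(y_true, threshold):
    """Reference strategy: flag every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = died within the horizon, 0 = survived.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]

for pt in (0.2, 0.5, 0.8):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = treat_all_net_benefit(y_true, pt)
    print(f"threshold={pt:.1f}  model={model_nb:.3f}  treat-all={all_nb:.3f}")
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all reference and zero, which is the net benefit of flagging no one.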

Authors

  • Hua Zhang
    School of Clinical Medicine, Hangzhou Medical College, Hangzhou, China.
  • Bingtian Dong
    Department of Ultrasound, the First Affiliated Hospital of Anhui Medical University, Hefei, China.
  • Jialin Han
    Artera, Inc, Los Altos, California.
  • Lewen Huang
    Department of Ultrasound, Affiliated Hangzhou First People's Hospital, School of Medicine, Westlake University, Huansha Road 261, Shangcheng District, Hangzhou, 310006, P. R. China.