Development and Validation of Time-to-Event Machine Learning Models for Predicting Disease-Free Survival in Patients with Locally Advanced Colorectal Cancer: A Multicenter Cohort Study.
Journal:
Annals of Surgical Oncology
Published Date:
Dec 2, 2025
Abstract
BACKGROUND: The postoperative prognosis of locally advanced colorectal cancer (LACRC) exhibits significant heterogeneity. However, conventional models for predicting disease-free survival (DFS) often lack the necessary precision. Therefore, we aim to develop and validate time-to-event machine learning (ML) models for predicting DFS in patients with LACRC, ultimately improving prognostic accuracy.

PATIENTS AND METHODS: This multicenter cohort study enrolled 456 patients with LACRC from three medical centers. A training cohort consisting of 350 patients was formed from centers 1 and 2, while an external validation cohort comprising 106 patients was sourced from center 3. Preoperative computed tomography (CT) images were segmented to extract radiomics features, and a radiomics score (radscore) was calculated through feature engineering. In addition, intratumor heterogeneity (ITH) scores were derived by integrating clustered mask regions with global pixel distribution patterns. To predict DFS, five time-to-event ML models were trained: Cox proportional hazards, FastKernelSurvivalSVM, GradientBoostingSurvival (GB-Survival), RandomSurvivalForest, and ExtraSurvivalTrees. Model performance was assessed using the concordance index (C-index), and Survival SHapley Additive exPlanations over time (SurvSHAP(t)) analysis was conducted for model interpretation.

RESULTS: Among the models tested, GB-Survival demonstrated the highest predictive performance for DFS, achieving a C-index of 0.7823. SurvSHAP(t) analysis revealed that the key prognostic factors included the ITH score, pathological TNM stage, lymphovascular invasion, radscore, and the prognostic nutritional index.

CONCLUSIONS: The GB-Survival model that integrates multimodal data outperforms other time-to-event ML models in predicting DFS for LACRC. This approach may facilitate the development of data-driven treatment strategies and personalized risk stratification for patients with LACRC.
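
For readers unfamiliar with the C-index used to compare the five models, the following is a minimal sketch of how a concordance index can be computed for right-censored DFS data. It assumes Harrell's definition; the abstract does not state which C-index variant or software the authors used. The function name `harrell_c_index` and the toy follow-up times, event flags, and risk scores are hypothetical and are not taken from the study.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (illustrative sketch).

    times:       observed follow-up times (e.g. months to recurrence or censoring)
    events:      1 if the event (recurrence/death) was observed, 0 if censored
    risk_scores: model output where a higher value means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            # Simplification (assumption): pairs with tied times are skipped.
            continue
        # Let a be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            # If the shorter time is censored, the true ordering is unknown.
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk had the earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not from the study.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.7, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported C-index of 0.7823 for GB-Survival.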