A Unified Machine Learning Model for Relapse Prediction in Clinical Stage I Testicular Cancer.
Journal:
Andrology
Published Date:
Feb 19, 2026
Abstract
BACKGROUND: Approximately one-fourth of patients with clinical stage I testicular cancer relapse. For decades, risk stratification has been based on different tumor characteristics for seminomas and non-seminomas. Previous studies primarily used Cox proportional-hazards models and included only a limited number of variables. Machine learning techniques can integrate large datasets and may uncover novel combinations of risk factors.
OBJECTIVES: To develop and validate a unified machine learning-based relapse prediction model for clinical stage I testicular cancer, regardless of histologic subtype, using nationwide histopathological and clinical data.
MATERIALS AND METHODS: A population-based cohort study of 1377 patients diagnosed with clinical stage I testicular cancer in Denmark from 2013 to 2018. Histopathological and clinical data were obtained through centralized pathology assessment and systematic medical record review. Two tree-based binary classifiers (CatBoost and LightGBM) were trained to predict relapse, and a random survival forest model was used to estimate time-to-relapse. Data were split into a training set (80%, with 5-fold cross-validation) and a test set (20%), balanced by seminoma/non-seminoma subtype and outcome. Subgroup analyses were performed for seminoma and non-seminoma. Binary models were evaluated using receiver operating characteristic area under the curve, precision-recall area under the curve, and Matthews correlation coefficient; random survival forest performance was assessed using concordance index and Integrated Brier Score.
RESULTS: CatBoost outperformed LightGBM (receiver operating characteristic area under the curve = 0.74) and demonstrated a high negative predictive value (0.86). The random survival forest achieved a concordance index of 0.71. Predictive performance was stronger in the non-seminoma than in the seminoma subgroup. Top-ranked predictive features included lymphovascular invasion, embryonal carcinoma, tumor necrosis, rete testis invasion, tumor size, and elevated lactate dehydrogenase and β-human chorionic gonadotropin. Tumor necrosis and the anatomical location of lymphovascular invasion emerged as novel predictors.
DISCUSSION AND CONCLUSION: A unified machine learning-based model for relapse prediction in clinical stage I testicular cancer is feasible and demonstrates moderate predictive accuracy. It is particularly useful for ruling out relapse and shows greater robustness in non-seminoma. These findings provide a framework for validation in independent cohorts and highlight key predictive features for future research.
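The abstract does not describe how the split or the metrics were implemented. As an illustration only, the following standard-library Python sketch shows one way an 80/20 split balanced by subtype and outcome could be drawn, and how the negative predictive value and Matthews correlation coefficient follow from confusion-matrix counts. All function names, field names, and toy records are hypothetical and are not taken from the study.

```python
import math
import random
from collections import defaultdict


def stratified_split(records, key, test_fraction=0.2, seed=0):
    """Split records into (train, test), sampling each stratum separately
    so that every stratum contributes roughly test_fraction to the test set."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)
    train, test = [], []
    for group in strata.values():
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, with 1 meaning relapse."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def negative_predictive_value(tn, fn):
    """Share of predicted non-relapses that truly did not relapse."""
    return tn / (tn + fn) if (tn + fn) else float("nan")


def matthews_corrcoef(tp, fp, tn, fn):
    """MCC in [-1, 1]; returns 0.0 when any marginal total is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Synthetic toy cohort (not study data): subtype and relapse flag per patient.
toy = (
    [{"subtype": "seminoma", "relapse": 1}] * 12
    + [{"subtype": "seminoma", "relapse": 0}] * 48
    + [{"subtype": "non-seminoma", "relapse": 1}] * 12
    + [{"subtype": "non-seminoma", "relapse": 0}] * 28
)
train, test = stratified_split(toy, key=lambda r: (r["subtype"], r["relapse"]))
print(len(train), len(test))
```

Stratifying on the combined (subtype, outcome) key keeps the relapse rate and the seminoma/non-seminoma ratio similar in both partitions, which matters when relapses are a minority class. A high negative predictive value alongside a moderate MCC is consistent with a model that is better at ruling out relapse than at identifying it.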