Development and external validation of interpretable machine learning models for personalized multiple treatment recommendations in non-small cell lung cancer.
Journal:
International Journal of Medical Informatics
Published Date:
Oct 18, 2025
Abstract
OBJECTIVE: Treatment decision-making for non-small cell lung cancer (NSCLC) is complex, necessitating individualized decision-support tools to improve prognosis. This study aimed to develop and externally validate interpretable machine learning models to predict multiple treatment recommendations (surgery, radiotherapy and chemotherapy) and assess their prognostic implications.
METHODS: We utilized data from 31,873 NSCLC patients (aged 45 and older, single primary lesion, 2010-2020) from the Surveillance, Epidemiology, and End Results (SEER) database to develop predictive models, and an additional 11,220 patients formed the external validation cohort. Individual models for surgery, radiotherapy and chemotherapy recommendations were developed and validated using nine machine learning algorithms. Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC), accuracy, F1 score and Brier score. Shapley Additive exPlanations (SHAP) analysis was performed for model interpretability. Propensity score matching and Cox proportional hazards models were employed to quantify the survival benefits associated with each treatment modality.
RESULTS: LightGBM demonstrated superior performance across all three treatment modalities, achieving AUROC values of 0.902 (surgery), 0.726 (radiotherapy), and 0.829 (chemotherapy) in the development cohort, with robust external validation AUROC values of 0.906, 0.721, and 0.801, respectively. Decision curve analyses revealed that all three models conferred higher net clinical benefits. SHAP analyses identified M stage, tumor stage, N stage and age as critical predictors. Survival analyses revealed significant overall survival benefits from surgery in early-stage patients, whereas radiotherapy and chemotherapy markedly improved outcomes in advanced-stage patients.
CONCLUSION: The interpretable machine learning models accurately predicted individualized recommendations across multiple treatment modalities, providing valuable insights to support personalized decision-making and optimize patient outcomes for NSCLC patients.
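The abstract reports AUROC, Brier score and decision curve analysis without defining them. The following is a minimal sketch of how these quantities are conventionally computed for a binary treatment-recommendation model. It is not the authors' code: the function names and the toy labels and probabilities are invented for illustration, and the study's actual pipeline (LightGBM, SHAP, propensity score matching) is not reproduced here.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true, y_prob):
    """Probability that a random positive case is scored above a random negative case.

    Ties between a positive and a negative score count as one half.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of recommending treatment when predicted probability >= threshold.

    Standard decision-curve formula: TP/n - FP/n * threshold / (1 - threshold).
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that recommends treatment for everyone."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data, invented for illustration only (not taken from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]                   # 1 = treatment recommended
y_prob = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]   # hypothetical model outputs

print("Brier score:", brier_score(y_true, y_prob))
print("AUROC:", auroc(y_true, y_prob))
print("Net benefit (model, threshold 0.5):", net_benefit(y_true, y_prob, 0.5))
print("Net benefit (treat all, threshold 0.5):", net_benefit_treat_all(y_true, 0.5))
```

A decision curve plots `net_benefit` across a range of thresholds against the treat-all line and the treat-none line (net benefit zero). A model is usually considered clinically useful over the thresholds where its curve lies above both.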