Comparative study of XGBoost and logistic regression for predicting sarcopenia in postsurgical gastric cancer patients.

Journal: Scientific reports
PMID:

Abstract

Machine learning (ML) techniques, particularly XGBoost and logistic regression, have gained significant attention in recent research for predicting sarcopenia among postsurgical gastric cancer patients. Sarcopenia, characterized by the progressive loss of skeletal muscle mass and strength, is a serious concern in these patients because of its association with poor postoperative outcomes, including increased morbidity and mortality. In this study, machine learning was used to establish a risk prediction model for sarcopenia in patients with gastric cancer undergoing gastrectomy, with the aim of facilitating early intervention and reducing the incidence of postoperative complications. Gastric cancer patients who underwent surgery at a tertiary comprehensive hospital in Nanjing (China) from January 2022 to December 2023 were retrospectively included, and their clinical and follow-up data were collected. An XGBoost model and a multivariate logistic regression model were used to screen the factors related to postoperative outcomes, and the results of the two models were compared. The area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, and specificity were calculated to evaluate the predictive value of the XGBoost model. The SHAP (SHapley Additive exPlanations) method was used to explain the XGBoost model and determine the impact of each feature on its predictions. A total of 231 postoperative gastric cancer patients were included, of whom 128 (55.4%) developed sarcopenia. The results of the univariate analysis and LASSO (Least Absolute Shrinkage and Selection Operator) regression were cross-validated, and 5 key study variables were ultimately selected: serum albumin, comorbid diabetes, operation style, nutritional score, and ECOG (Eastern Cooperative Oncology Group) performance status score.
The XGBoost model achieved a higher AUC (0.987, 95% CI: 0.976-0.998) than the logistic regression model (0.918, 95% CI: 0.873-0.963) in the training set. SHAP analysis showed that, in the XGBoost model, diabetes, nutritional score, and serum albumin had the greatest impact on the prediction of sarcopenia risk after gastric cancer surgery, with diabetes and nutritional score being the most influential; the ECOG performance status score followed, and operation style had the least impact. In summary, the machine learning-based sarcopenia prediction model constructed in this study provides a valuable decision support tool for clinical screening and intervention of sarcopenia.
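
As an illustration of the evaluation metrics named in the abstract (AUC, sensitivity, and specificity), the following is a minimal sketch in plain Python. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made up for this example and are not study data. The AUC is computed here through its rank interpretation, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = sarcopenia)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): true labels and model scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc_value = auc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc_value:.3f}")
# prints "sensitivity=0.750 specificity=0.750 AUC=0.875"
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why the two kinds of metric are usually reported together.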

Authors

  • Yajing Gu
    Department of Urology, The Affiliated Jiangning Hospital of Nanjing Medical University, Nanjing, Jiangsu, China.
  • Shu Su
    Department of Urology, The Affiliated Jiangning Hospital of Nanjing Medical University, Nanjing, Jiangsu, China.
  • Xianping Wang
    Wound Ostomy Clinic, The Affiliated Jiangning Hospital of Nanjing Medical University, Nanjing, Jiangsu, China.
  • Juanjuan Mao
    Department of Thyroid and Breast Surgery, Ningbo Hospital of TCM Affiliated to Zhejiang Chinese Medicine University, Ningbo City, Zhejiang Province, China.
  • Xuan Ni
    Department of Orthopedics, The Affiliated Jiangning Hospital of Nanjing Medical University, Nanjing, Jiangsu, China.
  • Ai Li
    Nursing Department, The Affiliated Jiangning Hospital of Nanjing Medical University, Nanjing, Jiangsu, China.
  • Yueli Liang
    Department of General Surgery, The Affiliated Jiangning Hospital of Nanjing Medical University, Nanjing, Jiangsu, China.
  • Xing Zeng
    Second School of Clinic Medicine, Guangzhou University of Chinese Medicine, Guangzhou, 510120, China. zengxing-china@163.com.