Model development and validation for predicting small-cell lung cancer bone metastasis utilizing diverse machine learning algorithms based on the SEER database.

Journal: Medicine
PMID:

Abstract

This study aimed to identify a machine learning algorithm with strong performance in predicting bone metastasis (BM) in small-cell lung cancer (SCLC) and to build a simple web-based predictor on it. Demographic and clinicopathological data on patients with SCLC, including their BM status, were extracted from the Surveillance, Epidemiology, and End Results (SEER) database for 2010 to 2018. These data were used to develop 12 machine learning models: support vector machine, logistic regression, naive Bayes, extreme gradient boosting (XGB), decision tree, random forest, ExtraTrees, LightGBM, GradientBoosting, AdaBoost, multilayer perceptron (MLP), and k-nearest neighbor. The models were compared on accuracy, precision, recall, F1-score, area under the receiver operating characteristic curve (AUC), and Brier score. The best-performing model was then selected and used to interpret the associations between the clinicopathological characteristics and the target variable (presence or absence of BM), with the aim of clarifying which factors may influence BM risk in SCLC patients. A total of 89,366 SCLC patients were included, of whom 8269 (9.25%) developed BM. Age, T stage, N stage, liver metastasis, lung metastasis, marital status, income, M stage, American Joint Committee on Cancer stage, and brain metastasis were identified as independent risk factors for BM in SCLC. Among the models evaluated, the XGB model performed best in both internal and external data validation, with AUCs of 0.965 (training set), 0.962 (validation set), and 0.961 (testing set). The XGB algorithm was then used to develop a web-based predictor of BM risk in SCLC patients, intended to assist doctors in clinical decision-making.
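
The six comparison metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name binary_metrics, the 0.5 decision threshold, and the toy labels and probabilities are assumptions made for illustration only, and the numbers have no connection to the SEER cohort. AUC is computed here as the fraction of (positive, negative) pairs in which the positive case receives the higher predicted probability, with ties counted as one half.

```python
from typing import Dict, List


def binary_metrics(y_true: List[int], y_prob: List[float],
                   threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, precision, recall, F1, AUC and Brier score for binary labels.

    y_true holds 0/1 labels (1 = bone metastasis present); y_prob holds the
    predicted probability of class 1. The threshold is an illustrative choice.
    """
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("y_true and y_prob must be non-empty and equal length")

    # Hard predictions are needed only for the threshold-based metrics.
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # AUC via pairwise comparison of positive and negative scores.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    auc = wins / (len(pos) * len(neg))

    # Brier score: mean squared error of the predicted probabilities.
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "auc": auc, "brier": brier}


# Toy data, invented for illustration.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_probs = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.7, 0.05]
toy_result = binary_metrics(toy_labels, toy_probs)
```

With a BM prevalence of about 9%, as reported above, a model that always predicts "no BM" would already reach roughly 91% accuracy, which is one reason to read accuracy alongside the other metrics. AUC and the Brier score use the predicted probabilities directly and do not depend on the threshold.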

Authors

  • Shuai Qie
    Department of Radiation Oncology, Affiliated Hospital of Hebei University, Baoding, Hebei Province, PR China.
  • Xin Zhang
    First Department of Infectious Diseases, The First Affiliated Hospital of China Medical University, Shenyang, China.
  • Jiusong Luan
  • Zhelun Song
  • Jingyun Li
    Department of AIDS Research, State Key Laboratory of Pathogen and Biosafety, Beijing Institute of Microbiology and Epidemiology, Beijing 100071, China.
  • Jingyu Wang
    Center of Medical & Health Analysis, School of Public Health, Peking University, Beijing, China.