Development and validation of a machine learning model based on complete blood counts to predict clinical outcomes in urothelial carcinoma patients.

Journal: Clinica chimica acta; international journal of clinical chemistry
Published Date:

Abstract

Urothelial carcinoma (UC) is a highly malignant disease with significant public health implications. Despite advances in oncology, tools for early diagnosis and effective prognosis remain limited. This study aimed to develop a machine learning model that uses complete blood count (CBC) data to predict clinical outcomes in UC patients. A retrospective, two-center cohort study analyzed 23 CBC variables from 477 UC patients at Xuhui Hospital of Fudan University (discovery cohort) and 297 UC patients at Putuo People's Hospital of Tongji University (validation cohort). CBC data were collected before treatment and three months after treatment, with overall survival (OS) as the primary endpoint. Nine machine learning models were developed in the discovery cohort and validated independently. Feature selection identified a logistic regression (LR) model incorporating white blood cell (WBC) count and lymphocyte percentage (LYMPH%) as the optimal predictor. In the discovery cohort, the model achieved an area under the ROC curve (AUC) of 0.93 (95% CI: 0.90-0.97), area under the precision-recall curve (AUPRC) of 0.94 (95% CI: 0.89-0.99), positive predictive value (PPV) of 0.87 (95% CI: 0.75-0.98), negative predictive value (NPV) of 0.82 (95% CI: 0.78-0.87), accuracy of 0.83 (95% CI: 0.80-0.88), and F1 score of 0.82 (95% CI: 0.79-0.86). Results in the validation cohort were comparable (AUC 0.88 [95% CI: 0.84-0.93], AUPRC 0.81 [95% CI: 0.75-0.86], PPV 0.77 [95% CI: 0.71-0.84], NPV 0.89 [95% CI: 0.84-0.95], accuracy 0.84 [95% CI: 0.80-0.89], and F1 score 0.80 [95% CI: 0.74-0.87]). Decision curve analysis demonstrated consistent net benefits, while Kaplan-Meier analysis indicated significantly shorter OS in the subgroup predicted to have worse outcomes. After treatment, WBC counts increased and LYMPH% decreased in deceased patients, whereas survivors showed the opposite trends (P < 0.05).
These findings suggest that a simple, cost-effective CBC-based machine learning model can effectively predict UC prognosis, aiding clinical decision-making.
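
For readers less familiar with the metrics reported above, the following is a minimal sketch of how PPV, NPV, accuracy, F1 score, and AUC are computed from binary labels and predicted risk scores. It is not the authors' code. The labels, scores, the 0.5 decision threshold, the choice of death as the positive class (1), and all function names are illustrative assumptions that do not come from the study.

```python
from typing import Dict, List, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """PPV, NPV, accuracy, and F1 from binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    nan = float("nan")
    return {
        "ppv": tp / (tp + fp) if (tp + fp) else nan,
        "npv": tn / (tn + fn) if (tn + fn) else nan,
        "accuracy": (tp + tn) / len(pairs) if pairs else nan,
        "f1": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else nan,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: eight patients, 1 = died, 0 = survived (assumed coding).
y_true: List[int] = [1, 1, 1, 0, 0, 0, 0, 1]
risk_scores: List[float] = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

# Assumed decision threshold of 0.5 on the predicted risk.
y_pred: List[int] = [1 if s >= 0.5 else 0 for s in risk_scores]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, risk_scores)
print(metrics, auc)
```

PPV, NPV, accuracy, and F1 depend on the chosen threshold, whereas AUC is computed from the ranking of the scores alone, which is one reason abstracts such as this one report both kinds of measure.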

Authors

  • Jie Cheng
    State Key Laboratory of Animal Nutrition, College of Animal Science and Technology, China Agricultural University, Beijing, China.
  • Fei Chen
    Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China.
  • Yunxiao Song
    Department of Clinical Laboratory, Shanghai Xuhui Central Hospital, Fudan University, Shanghai, China. xzxsh@sina.com.
  • Siyang Wang
    Department of Geratology, Shanghai Xuhui Central Hospital/Xuhui Hospital, Fudan University, Shanghai 200031, China.
  • Jingying Jia
    Department of Central Laboratory, Shanghai Xuhui Central Hospital/Xuhui Hospital, Fudan University, Shanghai 200031, China.
  • Hang Wang
    Key Subject Laboratory of Nuclear Safety and Simulation Technology, Harbin Engineering University, Harbin, 150001, China. Electronic address: wanghang1990312@126.com.
  • Houbao Liu
    Zhongshan Hospital, Fudan University, Shanghai 200032, China. Electronic address: zsliuhb@sina.cn.