Development and validation of a machine learning model based on complete blood counts to predict clinical outcomes in urothelial carcinoma patients.

Journal: Clinica chimica acta; international journal of clinical chemistry

Published Date: May 15, 2025

Abstract

Urothelial carcinoma (UC) is a highly malignant disease with significant public health implications. Despite advancements in oncology, early diagnosis and effective prognostic tools remain limited. This study aimed to develop a machine learning model using complete blood count (CBC) data to predict clinical outcomes in UC patients. A retrospective, two-center cohort study was conducted, analyzing 23 CBC variables from 477 UC patients at Xuhui Hospital of Fudan University (discovery cohort) and 297 UC patients from Putuo People's Hospital of Tongji University (validation cohort). CBC data were collected before treatment and three months posttreatment, with overall survival (OS) as the primary endpoint. Nine machine learning models were developed in the discovery cohort and validated independently. Feature selection identified a logistic regression (LR) model incorporating white blood cell (WBC) count and lymphocyte percentage (LYMPH%) as the optimal predictor. The model achieved high performance, with an area under the ROC curve (AUC) of 0.93 (95% CI: 0.90-0.97), area under the precision-recall curve (AUPRC) of 0.94 (95% CI: 0.89-0.99), positive predictive value (PPV) of 0.87 (95% CI: 0.75-0.98), negative predictive value (NPV) of 0.82 (95% CI: 0.78-0.87), accuracy of 0.83 (95% CI: 0.80-0.88), and F1 score of 0.82 (95% CI: 0.79-0.86) in the discovery cohort, and comparable results in the validation cohort (AUC 0.88 [95% CI: 0.84-0.93], AUPRC 0.81 [95% CI: 0.75-0.86], PPV 0.77 [95% CI: 0.71-0.84], NPV 0.89 [95% CI: 0.84-0.95], accuracy 0.84 [95% CI: 0.80-0.89], and F1 score 0.80 [95% CI: 0.74-0.87]). Decision curve analysis demonstrated consistent net benefits, while Kaplan-Meier analysis indicated significantly shorter OS in the "predict worse outcomes" subgroup. Posttreatment, WBC counts increased and LYMPH% decreased in deceased patients, whereas survivors showed the opposite trends (P < 0.05). These findings suggest that a simple, cost-effective CBC-based machine learning model can effectively predict UC prognosis, aiding clinical decision-making.
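
To make the reported quantities concrete, the following is a minimal sketch of how a two-feature logistic regression score and the threshold-based metrics named in the abstract (PPV, NPV, accuracy, F1) and the AUC could be computed. It is not the authors' implementation: the function names, the coefficient values, and the two example patients are hypothetical and chosen only for illustration. The signs of the coefficients are an assumption loosely motivated by the abstract's observation that WBC rose and LYMPH% fell in deceased patients.

```python
import math


def predict_risk(wbc, lymph_pct, intercept, coef_wbc, coef_lymph):
    """Logistic regression probability from WBC count and LYMPH%.

    All coefficients are supplied by the caller; none come from the paper.
    """
    z = intercept + coef_wbc * wbc + coef_lymph * lymph_pct
    return 1.0 / (1.0 + math.exp(-z))


def classification_metrics(y_true, y_pred):
    """PPV, NPV, accuracy and F1 from binary labels (1 = event, 0 = no event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, len(pairs)),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def roc_auc(y_true, scores):
    """AUC as the share of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical coefficients and patients, for illustration only.
    params = dict(intercept=-1.0, coef_wbc=0.3, coef_lymph=-0.1)
    print(predict_risk(wbc=5.0, lymph_pct=35.0, **params))
    print(predict_risk(wbc=12.0, lymph_pct=10.0, **params))
```

In this sketch a patient would be assigned to a higher-risk group when the predicted probability exceeds a chosen cutoff (0.5 is a common default, but the abstract does not state which threshold was used), and the metrics would then be computed from those binary assignments against observed outcomes.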

Authors

Jie Cheng
State Key Laboratory of Animal Nutrition, College of Animal Science and Technology, China Agricultural University, Beijing, China.

Fei Chen
Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China.

Yunxiao Song
Department of Clinical Laboratory, Shanghai Xuhui Central Hospital, Fudan University, Shanghai, China. xzxsh@sina.com.

Siyang Wang
Department of Geratology, Shanghai Xuhui Central Hospital/Xuhui Hospital, Fudan University, Shanghai 200031, China.

Jingying Jia
Department of Central Laboratory, Shanghai Xuhui Central Hospital/Xuhui Hospital, Fudan University, Shanghai 200031, China.

Hang Wang
Key Subject Laboratory of Nuclear Safety and Simulation Technology, Harbin Engineering University, Harbin, 150001, China. Electronic address: wanghang1990312@126.com.

Houbao Liu
Zhongshan Hospital, Fudan University, Shanghai 200032, China. Electronic address: zsliuhb@sina.cn.

Keywords

Aged; Aged, 80 and over; Blood Cell Count; Cohort Studies; Female; Humans; Machine Learning; Male; Middle Aged; Prognosis; Retrospective Studies; Urologic Neoplasms

External Resources

PubMed ID (PMID): 40381672
