Machine learning for early diagnosis of Kawasaki disease in acute febrile children: retrospective cross-sectional study in China.
Journal:
Scientific reports
PMID:
40000757
Abstract
Early diagnosis of Kawasaki disease (KD) allows treatment to be initiated promptly, thereby preventing coronary artery aneurysms in children. However, early diagnosis is challenging because the diagnostic criteria are subjective. This study aimed to develop a machine learning prediction model that uses routine blood test results to distinguish KD from other febrile illnesses in Chinese children within the first five days of fever onset. The retrospective cross-sectional data were collected from the records of Guangzhou Women and Children's Medical Center, covering January 1, 2020, to April 30, 2024. Three machine learning models and five ensemble models were evaluated on this dataset. The study included 1,089 children with KD (mean age 32.8 ± 27.0 months; 34.5% female) and a control group of 81,697 children without KD (mean age 45.3 ± 33.6 months; 42.8% female). The supervised method eXtreme Gradient Boosting (XGBoost), tested without feature selection, achieved an area under the receiver operating characteristic (ROC) curve (AUC) of 0.9999, sensitivity of 0.9982, specificity of 0.9975, F1 score of 0.9979, accuracy of 0.9979, positive predictive value (PPV) of 0.9975, and negative predictive value (NPV) of 0.9982. The SHapley Additive exPlanations (SHAP) summary plot identified the five most important features: percentage of eosinophils (EO%), hematocrit (HCT), plateletcrit (PCT), gender, and absolute basophil count (BA#). This study demonstrates that an XGBoost model applied to routine blood test results can predict KD.
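The performance measures reported in the abstract (sensitivity, specificity, PPV, NPV, accuracy, F1 score) are all derived from the four cells of a confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code. The names `ConfusionCounts` and `diagnostic_metrics` are hypothetical, and the counts are made-up illustrative numbers, not data from the study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Cells of a binary confusion matrix (KD = positive class)."""
    tp: int  # KD cases predicted as KD
    fp: int  # non-KD cases predicted as KD
    tn: int  # non-KD cases predicted as non-KD
    fn: int  # KD cases predicted as non-KD


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Compute the standard diagnostic metrics from confusion counts."""
    sensitivity = c.tp / (c.tp + c.fn)   # true positive rate (recall)
    specificity = c.tn / (c.tn + c.fp)   # true negative rate
    ppv = c.tp / (c.tp + c.fp)           # positive predictive value (precision)
    npv = c.tn / (c.tn + c.fn)           # negative predictive value
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean of PPV and sensitivity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not taken from the study).
counts = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)
metrics = diagnostic_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Sensitivity and specificity are properties of the classifier alone, whereas PPV and NPV also depend on the ratio of KD to non-KD cases in the set being evaluated.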