Development and validation of an interpretable risk prediction model for the early classification of thalassemia.

Journal: NPJ digital medicine
Published Date:

Abstract

Thalassemia is an inherited blood disorder. Current diagnostic methods rely mainly on sophisticated equipment and specially trained technicians. This study aimed to identify and genotype thalassemia by applying machine learning (ML) algorithms to routine blood parameters. A derivation cohort of 31,311 individuals was recruited from four independent hospitals, and eight ML models were developed and compared using a set of evaluation metrics. An additional cohort of 2,000 patients was recruited for external validation to assess how well the models generalize. The categorical boosting (CatBoost) model showed the best discriminative ability in both the training and external validation cohorts. The model was then integrated into an online platform, which could serve as an auxiliary tool for identifying and genotyping thalassemia through automatic analysis of routine blood test parameters.
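
As a rough illustration of how candidate models might be compared on discriminative ability in an external cohort, the sketch below computes the area under the ROC curve (AUC) from predicted scores and ranks the models by it. The abstract does not name the metrics used, so AUC is an assumption here, as is the simplification to a binary thalassemia versus non-thalassemia label. The function names (roc_auc, rank_models), the model names, and all numbers are hypothetical toy values, not data or code from the study.

```python
from typing import Dict, List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores
    a random negative case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_models(
    labels: Sequence[int], scores_by_model: Dict[str, List[float]]
) -> List[Tuple[str, float]]:
    """Return (model name, AUC) pairs sorted from best to worst AUC."""
    aucs = {name: roc_auc(labels, s) for name, s in scores_by_model.items()}
    return sorted(aucs.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical external-validation labels (1 = thalassemia, 0 = not)
# and hypothetical predicted probabilities from two made-up models.
external_labels = [1, 1, 1, 0, 0, 0]
external_scores = {
    "model_a": [0.9, 0.8, 0.4, 0.5, 0.2, 0.1],
    "model_b": [0.6, 0.4, 0.7, 0.5, 0.6, 0.3],
}

ranking = rank_models(external_labels, external_scores)
for name, auc in ranking:
    print(f"{name}: AUC = {auc:.3f}")
```

The pairwise definition is quadratic in cohort size and is used here only for readability; on cohorts of the size reported above, a rank-based implementation from an established library would be the more practical choice.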

Authors

  • Jin-Xin Lai
    Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, Guangdong Province, China.
  • Jia-Wei Tang
    Department of Intelligent Medical Engineering, School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu Province, China.
  • Shan-Shan Gong
    Department of Clinical Laboratory, Hainan General Hospital (Hainan Affiliated Hospital of Hainan Medical University), Haikou, Hainan Province, China.
  • Ming-Xiong Qin
    Guigang City People's Hospital, Guigang, Guangxi Province, China.
  • Yu-Lu Zhang
    Laboratory Medicine, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, Guangdong Province, China.
  • Quan-Fa Liang
    School of Laboratory and Biotechnology, Southern Medical University, Guangzhou, Guangdong Province, China.
  • Li-Yan Li
    Technology Center of Prenatal Diagnosis and Genetic Diseases Diagnosis, Department of Gynecology and Obstetrics, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong Province, China. 64484928@qq.com.
  • Zhen Cai
  • Liang Wang
    Information Department, Dazhou Central Hospital, Dazhou 635000, China.
