Comparison of machine learning models for mucopolysaccharidosis early diagnosis using UAE medical records.

Journal: Scientific Reports
Published Date:

Abstract

Rare diseases, such as mucopolysaccharidosis (MPS), present significant challenges to the healthcare system. Among the most critical are delayed and inaccurate diagnosis. Early diagnosis of MPS is crucial, as it can significantly improve patients' response to treatment, thereby reducing the risk of complications or death. This study evaluates the performance of different machine learning (ML) models for MPS diagnosis using electronic health records (EHR) from the Abu Dhabi Health Services Company (SEHA). The retrospective cohort comprises 115 registered patients aged ≤ 19 years, with records from 2004 to 2022. Using nested cross-validation, we combined different feature selection algorithms with various ML algorithms and evaluated each combination with multiple evaluation metrics. Finally, the best-performing model was interpreted using feature contribution analysis methods, namely Shapley Additive Explanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME). We found that Naive Bayes trained on features selected by domain experts achieved the best performance, with an accuracy of 0.93 (0.08), AUC of 0.96 (0.04), F1-score of 0.91 (0.1), and MCC of 0.86 (0.16). SHAP and LIME analyses of the best-performing model highlighted key features related to dental manifestations and respiratory infections that commonly present in MPS patients, such as acute gingivitis, accretions on teeth, dental caries, acute pharyngitis, acute tonsillitis, and acute bronchitis. This study introduces a cost-effective screening approach for MPS using non-invasive EHR data, contributing to the advancement of digital screening tools for the early diagnosis of rare diseases.
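
For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, F1-score, and the Matthews correlation coefficient (MCC) can be computed from the confusion-matrix counts of a single held-out fold. It is not the authors' code: the function name binary_metrics and the example counts are hypothetical and chosen only for illustration. AUC is omitted because it requires predicted scores rather than counts.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, F1-score, and MCC from binary confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives,
    and false negatives. Undefined ratios (zero denominator) are
    returned as 0.0, which is one common convention.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0

    # F1 is the harmonic mean of precision and recall.
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0

    # MCC uses all four cells, so it stays informative under class imbalance.
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0

    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


# Hypothetical counts for one test fold (not taken from the study).
fold = binary_metrics(tp=9, fp=1, tn=12, fn=1)
print(fold)
```

In a nested cross-validation setting, such per-fold values would typically be summarized across the outer folds, for example as a mean with a measure of spread.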

Authors

  • Aamna AlShehhi
    Department of Biomedical Engineering, Khalifa University, PO Box 127788, Abu Dhabi, United Arab Emirates. aamna.alshehhi@ku.ac.ae.
  • Hiba Alblooshi
    ASPIRE Precision Medicine Research Institute, United Arab Emirates University, Abu Dhabi, United Arab Emirates.
  • Ruba Fadul
    Department of Biomedical Engineering and Biotechnology, Khalifa University, Abu Dhabi, United Arab Emirates.
  • Natnael Tumzghi
    Department of Biomedical Engineering and Biotechnology, Khalifa University, Abu Dhabi, United Arab Emirates.
  • Amal Al Tenaiji
    Department of Pediatrics, Sheikh Khalifa Medical City, Abu Dhabi, United Arab Emirates.
  • Mariam Al Harbi
    Research Department, SEHA-Corporate Medical and Clinical Affairs, Abu Dhabi, United Arab Emirates.
  • Fatma Al-Jasmi
    Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates.