Automated identification of Salmonella serotype using MALDI-TOF mass spectrometry and machine learning techniques.

Journal: Journal of Clinical Microbiology
Published Date:

Abstract

Salmonella serotyping is essential for epidemiological studies and clinical treatment guidance. However, traditional serological agglutination methods are time-consuming, technically complex, and difficult to adopt at scale. Matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) is a rapid and cost-effective microbial identification technique, but it cannot be used to differentiate serotypes. This study aims to integrate MALDI-TOF MS with machine learning algorithms to develop and validate a model for Salmonella serotype identification, improving efficiency and simplifying workflows. A total of 692 Salmonella isolates from Children's Hospital, Zhejiang University School of Medicine (ZUCH) and Wanbei Coal-Electricity Group General Hospital (WCGH) were analyzed using MALDI-TOF MS, generating 2,048 spectra. The ZUCH data were randomly divided into training and internal validation sets, and the WCGH data were used as an external validation set. Ten machine learning algorithms were evaluated for their ability to identify eight serotypes (B, C1, C2/3, D, E, Not A-F, Typhimurium, and Enteritidis). From 192 initial features, 16 were selected for construction of the final model. XGBoost demonstrated the best discriminative ability on the training set (area under the receiver operating characteristic curve [AUC] = 0.9898, sensitivity = 0.88, and specificity = 0.98). The streamlined XGBoost model achieved AUCs of 0.9662 and 0.9778 on the internal and external validation sets, respectively, accurately identifying Salmonella serotypes. To enhance usability, the model was deployed as a Streamlit-based application, facilitating interactive use and broader application. MALDI-TOF MS combined with XGBoost provides a fast and accurate method for Salmonella serotype identification, offering an efficient solution for laboratory diagnostics and epidemiological studies.
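The abstract reports a single sensitivity and a single specificity for an eight-class problem, but does not state how the per-class values were combined. The following is a minimal sketch of one common convention, assuming one-vs-rest counting with an unweighted (macro) average across classes. The function name, the toy labels, and the averaging choice are illustrative assumptions, not the authors' code or data.

```python
from collections import Counter


def macro_sensitivity_specificity(y_true, y_pred, labels):
    """Macro-averaged one-vs-rest sensitivity and specificity.

    Assumption: each class is scored against all others, then the
    per-class values are averaged with equal weight.
    """
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    n = len(y_true)
    sensitivities, specificities = [], []
    for label in labels:
        tp = pairs[(label, label)]
        fn = sum(c for (t, p), c in pairs.items() if t == label and p != label)
        fp = sum(c for (t, p), c in pairs.items() if t != label and p == label)
        tn = n - tp - fn - fp
        # Guard against classes that are absent from the sample.
        sensitivities.append(tp / (tp + fn) if (tp + fn) else 0.0)
        specificities.append(tn / (tn + fp) if (tn + fp) else 0.0)
    k = len(labels)
    return sum(sensitivities) / k, sum(specificities) / k


# Hypothetical toy data using three of the eight class names.
toy_labels = ["B", "C1", "D"]
toy_true = ["B", "B", "C1", "C1", "D", "D"]
toy_pred = ["B", "C1", "C1", "C1", "D", "B"]

toy_sens, toy_spec = macro_sensitivity_specificity(toy_true, toy_pred, toy_labels)
print(f"macro sensitivity={toy_sens:.4f}, macro specificity={toy_spec:.4f}")
```

With many classes the true-negative count for each class is large, so macro specificity tends to sit well above macro sensitivity, which is the same pattern as the reported 0.98 versus 0.88.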

Authors

  • Jun Ren
    School of Life Science and Technology, Harbin Institute of Technology, Harbin, 150001, People's Republic of China.
  • Jintao Xia
    Department of Clinical Laboratory, Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Hangzhou, Zhejiang, China.
  • Mengyu Zhang
    School of Pharmacy, Southwest Medical University, Luzhou 646000, China.
  • Chunhong Liu
    Department of Clinical Laboratory, Eye & ENT Hospital, Fudan University, Shanghai, China.
  • Yuanyuan Xu
    Shanghai Lung Tumor Clinical Medical Center, Shanghai Chest Hospital, Shanghai Jiao Tong University (SJTU), Shanghai 200030, China.
  • Jianing Wu
    School of Aeronautics and Astronautics, Sun Yat-Sen University, Guangzhou, 510006, P. R. China.
  • Yingzhu Li
    Department of Clinical Laboratory, Eye & ENT Hospital, Shanghai Medical College, Fudan University, Shanghai, China.
  • Mingming Zhou
    Department of Hepatobiliary Pancreatic Tumor Center, Chongqing University Cancer Hospital, Chongqing, China.
  • Shengjie Li
    Department of Neurosurgery, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan 250014, China.
  • Wenjun Cao
    Department of Clinical Laboratory, Eye & ENT Hospital, Shanghai Medical College, Fudan University, Shanghai, China. wgkjyk@aliyun.com.