Classification of subspecies based on MALDI-TOF MS protein profiles using machine learning models.

Journal: Microbiology Spectrum
PMID:

Abstract

is an important bacterial species used as a starter culture for fermented foods; however, two subspecies within this species exhibit different properties in these foods. Matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS) is the gold standard for microbial fingerprinting, but its resolving power is limited to the species level. This study aimed to combine MALDI-TOF mass spectra with machine learning to develop a new method for identifying two subspecies ( subsp. and subsp. ) and non-. species. In total, 227 strains were collected, and 908 spectra were obtained via on- and off-plate protein extraction. Only 68.7% of strains were correctly identified at the subspecies level using the Biotyper database, whereas the machine learning models performed considerably better. Partial least squares-discriminant analysis (PLS-DA), principal component analysis-K-nearest neighbor (PCA-KNN), and support vector machine (SVM) achieved accuracies of 0.823, 0.914, and 0.903, respectively. Random forest (RF) achieved an accuracy of 0.954 with an area under the receiver operating characteristic curve (AUROC) of 0.99, outperforming the other algorithms in distinguishing the subspecies. Machine learning proved to be a promising technique for the rapid and high-resolution classification of subspecies using MALDI-TOF MS.
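The two metrics reported above, accuracy and AUROC, can be illustrated with a minimal sketch. This is not the study's code: the function names `accuracy` and `auroc` are hypothetical, the labels and scores are made-up toy values rather than study data, and the example is binary, whereas a multi-class problem such as the one in the abstract would typically compute AUROC one class against the rest.

```python
def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auroc(y_true, scores, positive=1):
    """AUROC as the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, not taken from the study).
y_true = [1, 1, 1, 0, 0, 0]                # 1 = subspecies A, 0 = subspecies B
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]    # classifier score for class 1
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"AUROC    = {auroc(y_true, scores):.3f}")
```

Accuracy depends on the decision threshold (0.5 here, an arbitrary choice), while AUROC summarizes ranking quality across all thresholds, which is why a model can score differently on the two.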

Authors

  • Eiseul Kim
    Institute of Life Sciences & Resources and Department of Food Science and Biotechnology, Kyung Hee University, Yongin 17104, Republic of Korea.
  • Seung-Min Yang
    Institute of Life Sciences & Resources and Department of Food Science and Biotechnology, Kyung Hee University, Yongin 17104, Republic of Korea.
  • So-Yun Lee
    Department of Food Science and Biotechnology, Institute of Life Sciences & Resources, Kyung Hee University, Yongin, South Korea.
  • Dae-Hyun Jung
    Department of Smart Farm Science, Kyung Hee University, Yongin 17104, Republic of Korea.
  • Hae-Yeong Kim
    Institute of Life Sciences & Resources and Department of Food Science and Biotechnology, Kyung Hee University, Yongin 17104, Republic of Korea. Electronic address: hykim@khu.ac.kr.