Prediction of antimicrobial resistance in with a machine learning classifier based on WGS data.
Journal:
Microbiology Spectrum
Published Date:
Aug 5, 2025
Abstract
Antimicrobial resistance (AMR) often results in treatment failure and restricts precision medicine, emphasizing the need for molecular diagnosis of drug resistance. Machine learning (ML) techniques based on whole genome sequencing (WGS) data offer more precise prediction of resistance phenotypes. We incorporated WGS data from 3979 strains in our study. We modeled 10 common antibiotics using three types of features (gene, single nucleotide polymorphism (SNP), and k-mer) to identify the best model and to determine which features contributed most to model performance. The area under the curve (AUC) values of the 40 ML models for the 10 antibiotics ranged from 0.8345 to 0.9995. Performance indices such as the AUC of the gene model (0.9311-0.9992) and the integrated model (0.9313-0.9995) were markedly better than those of the SNP model (0.8345-0.9933) and the k-mer model (0.9024-0.9969). The best-model AUC values for six antibiotics (cefoxitin, tetracycline, methicillin, gentamicin, erythromycin, and clindamycin) were over 0.99; nine antibiotic models had AUC values over 0.96, and all could effectively predict AMR phenotypes. Additionally, we found that certain non-AMR genes, such as the X998_03220 gene, contributed significantly to drug resistance prediction and recurred across the models for several antibiotics. Our study developed ML models that can reliably predict AMR phenotypes for commonly used antibiotics in . We also identified potential molecular markers that can contribute to precision medicine implementation and healthcare cost reduction.
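As a rough illustration of the k-mer feature type and the AUC metric mentioned in the abstract, the following minimal sketch extracts k-mers from toy sequences and scores a single-marker predictor. It is not the study's pipeline: the sequences, labels, the marker k-mer `TTG`, and the function names `kmer_presence` and `auc` are all hypothetical and chosen only for demonstration.

```python
def kmer_presence(seq, k=3):
    """Return the set of k-mers observed in a DNA sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def auc(scores, labels):
    """AUC as the probability that a resistant sample outscores
    a susceptible one, with ties counted as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: sequence and phenotype (1 = resistant, 0 = susceptible).
samples = [
    ("ACGTTGCA", 1),
    ("ACGTAGCA", 0),
    ("GGTTGACC", 1),
    ("GGATCACC", 0),
    ("CATTGGAA", 0),  # susceptible, but carries the marker k-mer
]

MARKER = "TTG"  # assumed marker k-mer, for illustration only

# Score each sample 1.0 if the marker k-mer is present, else 0.0.
scores = [1.0 if MARKER in kmer_presence(seq) else 0.0 for seq, _ in samples]
labels = [label for _, label in samples]

toy_auc = auc(scores, labels)
print(f"Toy AUC for marker {MARKER}: {toy_auc:.4f}")
```

In a real setting the presence/absence vector over many k-mers (or genes, or SNPs) would be fed to a trained classifier, and the AUC would be computed on held-out strains rather than on the data used to pick the marker.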