Machine learning-selected minimal features drive high-accuracy rule-based antibiotic susceptibility predictions for via metagenomic sequencing.
Journal:
Microbiology spectrum
Published Date:
Jul 11, 2025
Abstract
Antimicrobial resistance (AMR) represents a critical global health challenge, demanding rapid and accurate antimicrobial susceptibility testing (AST) to guide timely treatment. Traditional culture-based AST methods are slow, while existing whole-genome sequencing (WGS)-based models often suffer from overfitting, poor interpretability, and diminished performance on clinical metagenomic data. In this study, we developed an interpretable genotypic AST approach for using minimal genomic determinants. Analysis of 4,796 . genomes and AST data for 18 antibiotics revealed one to five key resistance genes per antibiotic, including two previously uncharacterized vancomycin resistance markers. These features enabled highly accurate rule-based predictions, with area under the curve (AUC) values ranging from 0.94 to 1.00. For isolate-level testing, the model demonstrated an overall sensitivity of 97.43% and specificity of 99.02%, corresponding to a very major error (VME) rate of 2.57% and a major error (ME) rate of 0.98%. Furthermore, after optimization for shallow-depth metagenomic sequencing, the model achieved 81.82% to 100% accuracy in AST predictions for 59 clinical samples, bypassing the need for bacterial isolation and reducing diagnostic time by an average of 39.9 hours. By combining minimal feature selection with strong interpretability and adaptability to metagenomic data, this method offers a practical solution for rapid and reliable AST in clinical settings.
Importance
Antimicrobial resistance (AMR) in poses a critical challenge to global health, necessitating rapid and reliable antimicrobial susceptibility testing (AST) for timely treatment decisions. Traditional culture-based AST is slow, while existing whole-genome sequencing (WGS)-based approaches often suffer from overfitting and poor interpretability. This study introduces a rule-based, interpretable genotypic AST model for that leverages minimal genomic determinants, achieving over 97% accuracy in isolate-level testing and high accuracy in clinical metagenomic samples. By extracting key resistance features and applying a rule-based approach, our model enables faster AST predictions and enhances hospital surveillance of resistant strain outbreaks. This culture-independent method reduces diagnostic time by nearly 40 hours, providing a scalable and actionable solution for clinical AMR management.
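To make the reported error rates concrete, the following is a minimal sketch of a rule-based susceptibility call and of how VME and ME relate to sensitivity and specificity. It is not the study's implementation: the rule table, the antibiotic and marker names, the "any marker present means resistant" logic, and the function names are all assumptions made for illustration. The metric definitions are the standard ones: VME is the fraction of phenotypically resistant isolates predicted susceptible (1 minus sensitivity), and ME is the fraction of phenotypically susceptible isolates predicted resistant (1 minus specificity), which matches the abstract's figures (100% - 97.43% = 2.57% and 100% - 99.02% = 0.98%).

```python
from typing import Dict, Iterable, List, Set

# Hypothetical rule table: antibiotic -> marker genes.
# Placeholder names only; these are NOT the determinants reported in the study.
RULES: Dict[str, Set[str]] = {
    "antibiotic_A": {"marker_1", "marker_2"},
    "antibiotic_B": {"marker_3"},
}


def predict_resistant(detected_genes: Iterable[str], antibiotic: str) -> bool:
    """Call an isolate resistant if any marker gene for the antibiotic is detected.

    The OR-of-markers rule is an assumption chosen for simplicity.
    """
    return bool(RULES[antibiotic] & set(detected_genes))


def error_rates(predicted: List[bool], observed: List[bool]) -> Dict[str, float]:
    """Compare predicted and phenotypic calls (True = resistant).

    VME = false-susceptible calls / phenotypically resistant isolates.
    ME  = false-resistant calls  / phenotypically susceptible isolates.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    tp = sum(1 for p, o in zip(predicted, observed) if p and o)
    fn = sum(1 for p, o in zip(predicted, observed) if not p and o)
    fp = sum(1 for p, o in zip(predicted, observed) if p and not o)
    tn = sum(1 for p, o in zip(predicted, observed) if not p and not o)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one resistant and one susceptible isolate")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "vme": fn / (tp + fn),
        "me": fp / (tn + fp),
    }


if __name__ == "__main__":
    # Toy data: detected genes per isolate and phenotypic results for antibiotic_A.
    isolates = [{"marker_1"}, set(), {"marker_2"}, {"marker_9"}]
    phenotype = [True, True, False, False]
    calls = [predict_resistant(genes, "antibiotic_A") for genes in isolates]
    print(error_rates(calls, phenotype))
```

Keeping the rules in a plain lookup table is what makes this style of model easy to audit: each prediction can be traced to the specific marker that triggered it.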