Machine learning based on pangenome-wide association studies reveals the impact of host source on the zoonotic potential of closely related bacterial pathogens.
Journal: Communications Biology
Published Date: Aug 20, 2025
Abstract
Variations in host species significantly impact bacterial growth traits and antibiotic resistance, making it essential to consider host origin when evaluating the zoonotic potential of pathogens. This study focuses on multiple Brucella species, which share highly similar genetic material, to explore the relationship between host origin and zoonotic potential by integrating pangenome-wide association studies (pan-GWAS) with machine learning (ML). Our results present an open pangenome of Brucella spp. derived from the whole-genome sequencing (WGS) data of 991 strains and identify 268 genes potentially associated with the zoonotic potential of Brucella. Integrating these genes into an ML model based on the support vector machine (SVM) algorithm allows us to predict the zoonotic potential of various Brucella strains with high accuracy. Our findings reveal that zoonotic potential varies by host origin: Brucella melitensis strains isolated from humans exhibit higher zoonotic potential than those isolated from cattle, goats, and sheep, while Brucella suis biovar 2 strains isolated from domestic pigs display higher zoonotic potential than those isolated from wild boars. Our study proposes a method for predicting and quantifying the zoonotic potential of closely related bacterial pathogens from different host origins, providing valuable insights for risk assessment and public health strategy.
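To make the gene-association step more concrete, the following is a minimal sketch of how a gene presence/absence matrix might be screened against a binary trait. The abstract does not state which statistical test the authors used; Fisher's exact test is assumed here only because it is a common choice for this kind of 2x2 comparison. The gene names, strain labels, and data are hypothetical toy values, not results from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def gene_trait_association(presence, labels):
    """Return a p-value per gene.

    presence: dict mapping gene name -> list of 0/1 calls, one per strain.
    labels:   list of 0/1 trait labels, one per strain (assumed: 1 = zoonotic).
    """
    pvalues = {}
    for gene, calls in presence.items():
        pairs = list(zip(calls, labels))
        a = sum(1 for g, y in pairs if g and y)          # gene present, trait present
        b = sum(1 for g, y in pairs if g and not y)      # gene present, trait absent
        c = sum(1 for g, y in pairs if not g and y)      # gene absent, trait present
        d = sum(1 for g, y in pairs if not g and not y)  # gene absent, trait absent
        pvalues[gene] = fisher_exact_two_sided(a, b, c, d)
    return pvalues


# Hypothetical toy data: 8 strains, 2 accessory genes.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
presence = {
    "gene_A": [1, 1, 1, 1, 0, 0, 0, 0],  # tracks the trait perfectly
    "gene_B": [1, 0, 1, 0, 1, 0, 1, 0],  # unrelated to the trait
}
pvalues = gene_trait_association(presence, labels)
for gene, p in sorted(pvalues.items(), key=lambda kv: kv[1]):
    print(f"{gene}: p = {p:.4f}")
```

In a workflow of the kind the abstract describes, genes passing a multiple-testing-corrected threshold could then serve as binary input features for a classifier such as an SVM. That downstream step is not shown here.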