Machine learning surveillance of foodborne infectious diseases using wastewater microbiome, crowdsourced, and environmental data.

Journal: Water Research
PMID:

Abstract

Clostridium perfringens (CP) is a common cause of foodborne infection that poses significant human health risks and imposes a high economic burden. Effective CP disease surveillance is therefore essential for preventive and therapeutic interventions; however, conventional practices often entail complex, resource-intensive, and costly procedures. This study introduces a data-driven machine learning (ML) modeling framework for CP-related disease surveillance. The framework leverages an integrated dataset of municipal wastewater microbiome data (e.g., CP abundance), crowdsourced data (CP-related web search keywords), and environmental data. Several optimization strategies, including data integration, data normalization, model selection, and hyperparameter tuning, were implemented to improve ML modeling performance, leading to enhanced predictions of CP cases over time. Explainable artificial intelligence methods identified CP abundance as the most reliable predictor of CP disease cases. Multi-omics analysis subsequently revealed the presence of CP and its genotypes/toxinotypes in wastewater, validating the utility of microbiome-data-enabled ML surveillance for foodborne diseases. This ML-based framework thus shows significant potential for complementing and reinforcing existing disease surveillance systems.
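
The optimization steps named in the abstract (data integration, normalization, model selection, hyperparameter tuning) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the study's models, features, or data, so ridge regression stands in as a placeholder model, all weekly values are made up, and the function and variable names are hypothetical. It uses only the Python standard library.

```python
from math import sqrt
from statistics import mean, pstdev


def integrate(*sources):
    """Data integration: join per-week dicts on weeks present in every source."""
    weeks = sorted(set.intersection(*(set(s) for s in sources)))
    return weeks, [[s[w] for s in sources] for w in weeks]


def zscore_fit(rows):
    """Per-column (mean, std); a constant column gets std 1.0 to avoid division by zero."""
    return [(mean(c), pstdev(c) or 1.0) for c in zip(*rows)]


def zscore_apply(rows, stats):
    """Normalization: standardize each column with previously fitted statistics."""
    return [[(v - m) / s for v, (m, s) in zip(r, stats)] for r in rows]


def predict(X, w, b):
    return [sum(wj * xj for wj, xj in zip(w, x)) + b for x in X]


def fit_ridge(X, y, alpha, lr=0.05, epochs=2000):
    """Placeholder model: ridge regression fitted by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        err = [p - t for p, t in zip(predict(X, w, b), y)]
        grad_w = [sum(e * x[j] for e, x in zip(err, X)) / n + alpha * w[j]
                  for j in range(d)]
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * sum(err) / n
    return w, b


def rmse(y_true, y_pred):
    return sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


# Made-up weekly series (NOT data from the study), keyed by week index.
abundance = {w: 1.0 + 0.1 * ((7 * w) % 5) for w in range(20)}    # wastewater CP abundance
searches = {w: float((3 * w) % 4) for w in range(20)}            # web search volume
temperature = {w: 10.0 + w for w in range(1, 20)}                # week 0 missing on purpose
cases = {w: 5.0 + 20.0 * abundance[w] + 1.5 * searches[w] for w in range(20)}

weeks, rows = integrate(abundance, searches, temperature, cases)
X = [r[:3] for r in rows]
y = [r[3] for r in rows]

# Chronological split: earlier weeks train, later weeks validate.
split = 14
stats = zscore_fit(X[:split])  # statistics come from the training weeks only
X_train, X_val = zscore_apply(X[:split], stats), zscore_apply(X[split:], stats)
y_train, y_val = y[:split], y[split:]

# Hyperparameter tuning: pick the ridge penalty with the lowest validation RMSE.
ALPHAS = [0.0, 0.1, 1.0]
scores = {a: rmse(y_val, predict(X_val, *fit_ridge(X_train, y_train, a)))
          for a in ALPHAS}
best_alpha = min(scores, key=scores.get)
print(best_alpha, scores[best_alpha])
```

Two design choices in the sketch are worth noting. The split is chronological rather than random, since the task is predicting cases over time, and the normalization statistics are fitted on the training weeks only so that no information from later weeks leaks into the model. The explainable-AI and multi-omics steps described in the abstract are not covered here.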

Authors

  • Seungdae Oh
    Department of Civil Engineering, College of Engineering, Kyung Hee University, Yongin, Republic of Korea. Electronic address: soh@khu.ac.kr.
  • Haeil Byeon
    Department of Civil Engineering, College of Engineering, Kyung Hee University, Yongin, Republic of Korea.
  • Jonathan Wijaya
    Department of Civil Engineering, College of Engineering, Kyung Hee University, Yongin, Republic of Korea.