Combining flow virometry with tree-based machine learning models for rapid virus particle estimation in different wastewater matrices.

Journal: Water Research
Published Date:

Abstract

Enumerating virus particles (VPs) at different stages of the wastewater treatment process or along the distribution network is essential for ensuring high performance and reducing public health risks. Herein, we aimed to (i) optimize the flow virometry (FVM) protocol for use in wastewater matrices, (ii) correlate FVM data with specific virus genera of interest, and (iii) develop machine learning (ML) models for determining total VP concentration. We identified and tested a comprehensive set of parameters to determine the optimal conditions for wastewater FVM. Specifically, we tested various sample preprocessing steps to enhance FVM detection sensitivity, including the use of different nucleic acid staining dyes, surfactant addition and concentration optimization, glutaraldehyde fixation, and the effect of sample freezing before FVM analysis. Across 206 samples, Spearman's rank correlation between FVM data and virus genera concentrations measured by a conventional qPCR-based method was positive for all five virus genera, with coefficients ranging from 0.21 to 0.44 (p < 0.01). The extreme gradient-boosting (XGB) model, which takes easily accessible physicochemical water parameters as input (such as turbidity, electroconductivity, total dissolved solids, total suspended solids, pH, chemical oxygen demand, and concentrations of nitrate nitrogen, nitrite nitrogen, and ammonium nitrogen) and gives total virus count as output, outperformed the random forest (RF) model and can be used across all types of wastewater matrices. Furthermore, during model development, XGB achieved a root mean square error that was better than that of RF by a mean of 23 % across the four treatment processes (influent, aerobic, sand, and MBR). This study demonstrates that FVM, combined with ML, can significantly enhance monitoring capabilities by accurately estimating VP concentrations across diverse wastewater matrices.
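
The two statistics quoted in the abstract, Spearman's rank correlation and a relative difference in root mean square error (RMSE), can be illustrated with a minimal Python sketch. This is not the authors' code: the function names are hypothetical, the numbers in the usage section are made up for illustration only, and averaging the per-process relative RMSE reductions is an assumption about how a "mean of 23 %" could be computed, since the abstract does not specify it.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def relative_rmse_reduction(rmse_baseline, rmse_candidate):
    """Percent by which the candidate RMSE is lower than the baseline RMSE."""
    return 100.0 * (rmse_baseline - rmse_candidate) / rmse_baseline


if __name__ == "__main__":
    # Made-up illustrative values, NOT data from the study.
    fvm_counts = [5.1, 6.3, 5.8, 7.0, 6.1]
    qpcr_conc = [2.0, 3.1, 3.4, 3.9, 2.5]
    print("Spearman rho:", spearman_rho(fvm_counts, qpcr_conc))

    # Hypothetical per-process RMSE pairs (RF baseline, XGB candidate).
    pairs = {"influent": (1.00, 0.80), "aerobic": (0.90, 0.70)}
    reductions = [relative_rmse_reduction(rf, xgb) for rf, xgb in pairs.values()]
    print("Mean RMSE reduction (%):", sum(reductions) / len(reductions))
```

Ranking before correlating makes the statistic sensitive to any monotonic relationship, not only a linear one, which suits concentration data spanning several orders of magnitude. The sketch returns only the coefficient; a p-value like the one reported would need an additional significance test.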

Authors

  • Yevhen Myshkevych
    Environmental Science and Engineering Program, Division of Biological and Environmental Science and Engineering, King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia; KAUST Center of Excellence on Smart Health, King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia.
  • Ibrahima N'Doye
    Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), 23955-6900, Thuwal, Saudi Arabia.
  • Julie Sanchez Medina
    Environmental Science and Engineering Program, Division of Biological and Environmental Science and Engineering, King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia.
  • Fahad K Aljehani
    Electrical and Computer Engineering Program, Division of Computer, Electrical and Mathematical Science and Engineering, King Abdullah University of Science and Technology, Thuwal, 23955, Saudi Arabia.
  • Yanghui Xiong
    Environmental Science and Engineering Program, Division of Biological and Environmental Science and Engineering, King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia.
  • Taous-Meriem Laleg-Kirati
    King Abdullah University of Science and Technology, Computational Bioscience Research Center, Thuwal 23955-6900, Saudi Arabia. Electronic address: taousmeriem.laleg@kaust.edu.sa.
  • Pei-Ying Hong
    Environmental Science and Engineering Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), 23955-6900, Thuwal, Saudi Arabia.
