A calibration framework toward model generalization for bacteria concentration estimation in water resource recovery facilities.

Journal: Scientific Reports
PMID:

Abstract

A reduction in bacterial concentration in wastewater is a key indicator of the efficacy of water resource recovery facilities (WRRFs). However, monitoring bacterial concentrations in real time at each stage of a WRRF is challenging because it requires taking water samples and processing them offline. Although a few studies have proposed data-driven models to predict bacterial concentrations, generalizing these models to unseen data from different WRRFs remains challenging. This paper proposes a neural-network-based calibration approach that adapts the optimal models across various WRRFs in Saudi Arabia for bacterial estimation at the influent and effluent stages. The calibration relies on an out-of-distribution (OOD) framework applied to the physicochemical water parameters (e.g., pH, COD, TDS, turbidity, conductivity), with a design threshold chosen based on the data distribution of the received unseen samples. The framework continues to update the trained neural network model upon receiving new samples, so that the bacterial concentration estimates remain accurate. We tested the proposed calibration scheme on four WRRF datasets in Saudi Arabia and compared its results against two references: the model before calibration and the model after calibration without OOD. The before-calibration model was built with a traditional, optimal neural network approach, typically considered the conventional method for building neural networks. In the after-calibration-without-OOD case, the model continued retraining without explicitly checking for the OOD condition. The results showed that, for the selected baseline WRRF, the proposed calibration framework with the OOD scheme improved the worst-case influent bacteria concentration estimation by [Formula: see text] and [Formula: see text] relative to before calibration and after calibration without OOD, respectively. Similarly, the worst-case effluent bacteria concentration estimation improved by [Formula: see text] relative to before calibration and by [Formula: see text] relative to after calibration without OOD. Our findings highlight the importance of integrating the calibration framework with neural network approaches to achieve model generalization.
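
The OOD-gated calibration loop described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the OOD score is computed, how the threshold is derived from the received samples, or how the network is updated, so everything below is an assumption made for illustration: the Mahalanobis distance serves as the OOD score, the threshold is passed in by the caller, a least-squares regressor stands in for the neural network, and only samples flagged as OOD trigger a refit. All function names are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Stand-in for the neural network (assumption): least squares with a bias term."""
    A = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict_linear(w, X):
    """Predict with the stand-in model."""
    return np.column_stack([X, np.ones(len(X))]) @ w

def baseline_stats(X_base):
    """Mean and pseudo-inverse covariance of the baseline features
    (columns could be pH, COD, TDS, turbidity, conductivity)."""
    mean = X_base.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(X_base, rowvar=False))
    return mean, cov_inv

def ood_scores(X, mean, cov_inv):
    """Mahalanobis distance of each row from the baseline distribution
    (assumed OOD score; the abstract does not name one)."""
    diff = X - mean
    sq = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(sq, 0.0))

def calibrate(w, X_base, y_base, X_new, y_new, threshold):
    """Refit only when some new samples exceed the OOD threshold.

    Returns the (possibly updated) weights and the boolean OOD mask.
    """
    mean, cov_inv = baseline_stats(X_base)
    ood = ood_scores(X_new, mean, cov_inv) > threshold
    if not ood.any():
        return w, ood  # in-distribution batch: keep the current model
    X_all = np.vstack([X_base, X_new[ood]])
    y_all = np.concatenate([y_base, y_new[ood]])
    return fit_linear(X_all, y_all), ood
```

In this sketch the caller chooses `threshold`, for example as a high percentile of the OOD scores; the abstract only says that the design threshold is chosen from the data distribution of the received unseen samples and does not give the rule. The "after calibration without OOD" reference described above would correspond to refitting on every new batch, regardless of the mask.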

Authors

  • Fahad Aljehani
    Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), 23955-6900, Thuwal, Saudi Arabia. fahad.aljehani@kaust.edu.sa.
  • Ibrahima N'Doye
    Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), 23955-6900, Thuwal, Saudi Arabia.
  • Pei-Ying Hong
    Environmental Science and Engineering Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), 23955-6900, Thuwal, Saudi Arabia.
  • Mohammad Khalil Monjed
    Faculty of Science, Umm Al-Qura University, Makkah, Saudi Arabia.
  • Taous-Meriem Laleg-Kirati
    King Abdullah University of Science and Technology, Computational Bioscience Research Center, Thuwal 23955-6900, Saudi Arabia. Electronic address: taousmeriem.laleg@kaust.edu.sa.