Federated Machine Learning Enables Risk Management and Privacy Protection in Water Quality.
Journal:
Environmental Science & Technology
Published Date:
May 16, 2025
Abstract
Real-time water quality risk management in wastewater treatment plants (WWTPs) requires extensive data, yet data sharing remains largely a slogan because of data privacy concerns. Here we present an adaptive water system federated averaging (AWSFA) framework based on federated learning (FL), in which the model does not access the raw data but uses parameters trained on it. The study collected data from six WWTPs between 2018 and 2024 and developed 10 machine learning models for each effluent indicator, with the best-performing model, a bidirectional long short-term memory network (BM), serving as the baseline. Compared to direct training and classical federated averaging (FedAvg), AWSFA significantly reduces the mean absolute percentage error (MAPE) of BM. Analysis of input dimensions, data set size, and interpretability reveals that the performance improvement is driven not by the complexity of the algorithm design but by data sharing via parameter sharing. In simulations of possible water quality disturbances, the model remained robust when 50% of key features were missing. The study provides a way forward for data sharing and privacy preservation in water systems and offers theoretical support for the digital transformation of WWTPs in the era of big data and large models.
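The parameter-sharing idea in the abstract can be illustrated with a minimal sketch of classical FedAvg, the baseline the paper compares against. This is not the AWSFA framework itself, whose adaptive details the abstract does not specify. The sketch rests on several assumptions: a plain linear regressor stands in for the BM network, the six "plants" hold synthetic data, and the names `local_update`, `fed_avg`, `mape`, and `run_demo` are hypothetical. Only model weights leave each client; the raw arrays never do.

```python
import numpy as np


def local_update(weights, X, y, lr=0.1, epochs=50):
    """Train on one client's private data; return only the updated weights."""
    w = weights.copy()
    for _ in range(epochs):
        # Gradient of mean squared error for a linear model y ~ X @ w.
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w


def fed_avg(client_weights, client_sizes):
    """Classical FedAvg: average client weights, weighted by sample count."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent (y_true must be nonzero)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)


def run_demo(n_clients=6, rounds=5, seed=0):
    """Simulate six hypothetical plants with synthetic data (an assumption)."""
    rng = np.random.default_rng(seed)
    true_w = np.array([1.5, -0.7, 0.3])
    clients = []
    for _ in range(n_clients):
        n = int(rng.integers(40, 100))
        X = rng.normal(size=(n, 3))
        y = X @ true_w + rng.normal(scale=0.01, size=n)
        clients.append((X, y))

    global_w = np.zeros(3)
    for _ in range(rounds):
        # Each client starts from the shared weights and trains locally.
        updates = [local_update(global_w, X, y) for X, y in clients]
        # The server sees only weights and sample counts, never X or y.
        global_w = fed_avg(updates, [len(y) for _, y in clients])
    return global_w, true_w
```

Calling `run_demo()` returns the aggregated weights alongside the synthetic ground truth, and `mape` can then score predictions on any held-out data with nonzero targets. Weighting by sample count is the standard FedAvg choice; an adaptive scheme such as AWSFA would presumably replace that fixed weighting, but how it does so is not stated here.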