Federated Machine Learning Enables Risk Management and Privacy Protection in Water Quality.

Journal: Environmental science & technology
Published Date:

Abstract

Real-time water quality risk management in wastewater treatment plants (WWTPs) requires extensive data, yet data sharing remains largely a slogan because of data privacy concerns. Here we show an adaptive water system federated averaging (AWSFA) framework based on federated learning (FL), in which the model never accesses the raw data and instead uses parameters trained locally on those data. The study collected data from six WWTPs between 2018 and 2024 and developed 10 machine learning models for each effluent indicator; the best-performing model, a bidirectional long short-term memory network (BM), served as the baseline. Compared with direct training and classical federated averaging (FedAvg), AWSFA significantly reduces the mean absolute percentage error (MAPE) of BM. Analysis of input dimensions, data set size, and interpretability reveals that the performance improvement is driven not by the complexity of the algorithm design but by data sharing through parameter sharing. In simulations of possible water quality disturbances, the model remained robust when 50% of key features were missing. The study offers a way forward for data sharing and privacy preservation in water systems and provides theoretical support for the digital transformation of WWTPs in the era of big data and large models.
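
As an illustration of the parameter-sharing idea, the following is a minimal sketch of classical FedAvg aggregation and of the MAPE metric named in the abstract. It is not the AWSFA method, whose adaptive details the abstract does not describe. The function names, the flat-list representation of model parameters, and the example numbers are all assumptions made for this sketch.

```python
# Minimal sketch, not the authors' implementation.
# Assumption: each plant's model is represented as a flat list of floats,
# and each plant reports only those parameters plus its local sample count.

def fedavg(client_params, client_sizes):
    """Classical FedAvg: average parameters weighted by local sample count."""
    total = float(sum(client_sizes))
    n_params = len(client_params[0])
    return [
        sum(params[i] * size / total
            for params, size in zip(client_params, client_sizes))
        for i in range(n_params)
    ]


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    errors = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical parameters from two plants; raw data never leaves either plant.
plant_a = [1.0, 2.0, 0.5]
plant_b = [3.0, 4.0, 1.5]
global_params = fedavg([plant_a, plant_b], client_sizes=[100, 300])

# Hypothetical effluent measurements and predictions.
example_error = mape([100.0, 200.0], [110.0, 180.0])
```

Only the parameter lists and sample counts reach the aggregation step, which is the sense in which data are shared through parameters rather than as raw records.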

Authors

  • Yu-Qi Wang
    State Key Laboratory of Urban Water Resource and Environment, School of Eco-Environment, Harbin Institute of Technology, Shenzhen 518055, China.
  • Hong-Cheng Wang
    State Key Laboratory of Urban Water Resource and Environment, School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen 518055, PR China. Electronic address: wanghongcheng@hit.edu.cn.
  • Wen-Zhe Wang
    Department of Cardiac Surgery, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, China.
  • Hao-Lin Yang
    State Key Laboratory of Urban Water Resource and Environment, School of Eco-Environment, Harbin Institute of Technology, Shenzhen 518055, China.
  • Jia-Ji Chen
    CAS Key Laboratory of Environmental Biotechnology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China.
  • Yu-Xin Fan
    Key Lab of Environmental Biotechnology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China.
  • Wan-Xin Yin
    College of the Environment, Liaoning University, Shenyang 110036, PR China; State Key Laboratory of Urban Water Resource and Environment, School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen 518055, PR China.
  • Jia-Qiang Lv
    State Key Laboratory of Urban Water Resource and Environment, School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen 518055, PR China.
  • Xiao-Qin Luo
    Department of Geriatrics, The Second Xiangya Hospital of Central South University, Changsha, China.
  • Xiao Zhou
    College of Environmental Science and Engineering, Tongji University, 200092, Shanghai, China; Centre for Water Systems, College of Engineering, Mathematics and Physical Sciences, University of Exeter, Exeter, EX4 4QF, UK.
  • Ai-Jie Wang
    State Key Laboratory of Urban Water Resource and Environment, School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen 518055, PR China.