Identification of key feature variables and prediction of harmful algal blooms in a water diversion lake based on interpretable machine learning.

Journal: Environmental research
Published Date:

Abstract

Harmful algal blooms (HABs) are an increasing environmental problem in lakes, and water diversion has become a common and effective strategy for mitigating them. Early and accurate identification of HAB occurrence in lakes is essential for the scientific guidance of water diversion. Furthermore, the inevitable changes in the hydrodynamics and water environment of the receiving area during water diversion make it more challenging to identify the important environmental features of HABs. We therefore constructed a machine learning modelling framework suitable for predicting HABs, with favorable performance in both non-water diversion and water diversion states. We collected data from three monitoring sites for the years 2008-2020 (non-water diversion period from 2008 to 2013 and water diversion period from 2014 to 2020) as external validation, and from six sampling sites for the years 2021-2022 (non-water diversion period in 2021 and water diversion period in 2022) as internal validation. In a comparison of 10 machine learning models for comprehensive HAB prediction analyses in the external cohorts of Yilong Lake, the CatBoost model performed best (AUC = 0.948), and the 24 features were reduced to the 8 most important environmental features (including TP, TN and COD). In addition, the SHapley Additive exPlanations (SHAP) method was used to interpret this CatBoost model through a global interpretation, which describes the overall contribution of the model's features, and a local interpretation, which details how a given HAB forecast is made for a single sample from its input data. The interpretable CatBoost model also performed well in internal validation, and it has been converted into a convenient application for use by Bureau of Yilong Lake Administration personnel and researchers. Finally, the PLS-PM results indicate that water diversion indirectly mitigates HABs, mainly by diluting nutrient concentrations.
Overall, the final model performs well and offers practical benefits for predicting HABs during both the non-water diversion and water diversion periods of Yilong Lake, providing guidance for water diversion. This study also provides a reference for water diversion strategies in other similar eutrophic lakes.
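
The local interpretation described above attributes one sample's prediction to its individual features. The sketch below illustrates the underlying Shapley value idea on a toy case and is not the authors' pipeline: the scoring function `toy_risk`, the `BASELINE` and `SAMPLE` values, and the use of a single baseline point (rather than a background dataset, as SHAP typically uses) are all assumptions made for illustration. Only the feature names TP, TN and COD come from the abstract.

```python
from itertools import combinations
from math import factorial

FEATURES = ["TP", "TN", "COD"]

# Hypothetical reference point and hypothetical sample; values are illustrative only.
BASELINE = {"TP": 0.05, "TN": 1.0, "COD": 20.0}
SAMPLE = {"TP": 0.15, "TN": 2.0, "COD": 30.0}


def toy_risk(x):
    """Hypothetical HAB risk score standing in for a trained model's output."""
    return 2.0 * x["TP"] + 0.1 * x["TN"] + 0.01 * x["COD"] + 1.0 * x["TP"] * x["TN"]


def value(coalition):
    """Score when features in `coalition` take the sample's values
    and all other features stay at the baseline."""
    x = {f: (SAMPLE[f] if f in coalition else BASELINE[f]) for f in FEATURES}
    return toy_risk(x)


def shapley_values():
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = value(set(subset) | {f})
                without_f = value(set(subset))
                total += weight * (with_f - without_f)
        phi[f] = total
    return phi


if __name__ == "__main__":
    for name, contribution in shapley_values().items():
        print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the sample's score and the baseline score, which is the property that lets a local explanation account for a single forecast feature by feature. Exact enumeration is only feasible for a handful of features; for tree ensembles such as CatBoost, SHAP tooling relies on faster tree-specific algorithms.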

Authors

  • Yundong Wu
    Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, 430072, PR China; University of Chinese Academy of Sciences, Beijing, 100049, PR China.
  • Bo Xian
    Department of Neurology, Sichuan Provincial People's Hospital, School of Medicine, University of Electronic Science and Technology of China, Chengdu, China.
  • Xiaowei Xiang
    Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, 430072, PR China; University of Chinese Academy of Sciences, Beijing, 100049, PR China.
  • Fang Fang
    Department of Cardiology, Central War Zone General Hospital of the Chinese People's Liberation Army, Wuhan 430061, China.
  • Fuhao Chu
    Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, 430072, PR China; University of Chinese Academy of Sciences, Beijing, 100049, PR China.
  • Xingkang Deng
    Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, 430072, PR China; School of Environmental Studies, China University of Geosciences, Wuhan, 430074, PR China.
  • Qing Hu
    School of Mathematics and Statistics, Lanzhou University, Lanzhou, 730000, China. Electronic address: huq21@lzu.edu.cn.
  • Xiuqiong Sun
    Bureau of Yilong Lake Administration, Shiping, 662200, PR China.
  • Wei Tang
    Hepato-Biliary-Pancreatic Surgery Division, Department of Surgery, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan.
  • Shaopan Bao
    Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, 430072, PR China; University of Chinese Academy of Sciences, Beijing, 100049, PR China.
  • Genbao Li
    Key Laboratory of Algal Biology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, 430072, PR China. Electronic address: ligb@ihb.ac.cn.
  • Tao Fang
    Department of Precision Machinery and Precision Instrumentation, University of Science and Technology of China, Hefei, Anhui, 230026, People's Republic of China.