Assessment of monthly runoff simulations based on a physics-informed machine learning framework: The effect of intermediate variables in its construction.

Journal: Journal of Environmental Management
Published Date:

Abstract

Hydrological forecasting is of great importance for water resources management and planning, especially given the increasing occurrence of extreme events such as floods and droughts. Physics-informed machine learning (PIML) models integrate conceptual hydrologic models with machine learning (ML) models. In this process, the intermediate variables of PIML models serve as bridges between inputs and outputs, yet their impact on the performance of PIML models remains unclear. To fill this knowledge gap, this study constructs PIML models based on various hydrologic models, conducts comparative analyses of different intermediate variables in a case study of 205 CAMELS basins, and further explores the relationship between PIML model performance and catchment characteristics. The optimal ML model for constructing the PIML models is first selected from four ML models across the 205 basins. The PIML models are then developed based on five monthly water balance models, namely TM, XM, MEP, SLM, and TVGM. To quantify the potential impact of differences in intermediate variables, two sets of experiments are designed and performed: S1, with actual evapotranspiration as the intermediate variable, and S2, with soil moisture as the intermediate variable. Results show that the five PIML models generally outperformed the optimal standalone ML model, i.e., the Lasso model. Specifically, regardless of the choice of intermediate variable, the PIML-XM model consistently outperformed the other models within the same basins. Almost all constructed PIML models are affected by the choice of intermediate variable in monthly runoff simulations. Typically, S1 performed better than S2. The aridity index, forest fraction, and catchment area have a greater impact on model performance in S2. These findings improve our understanding of constructing PIML models in hydrology by emphasizing their excellent performance in runoff simulations and highlighting the importance of intermediate variables.
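
To make the role of the intermediate variable concrete, the following is a minimal sketch of the general idea, not the authors' implementation. It assumes a toy single-bucket monthly water balance (a hypothetical stand-in, not TM, XM, MEP, SLM, or TVGM), a small coordinate-descent Lasso written from scratch, and purely synthetic forcing and "observed" runoff. The function names `bucket_model`, `fit_lasso`, `predict`, and `nse`, the bucket capacities, and the `alpha` value are all illustrative assumptions. The two loops mirror the S1 and S2 setups: the conceptual model supplies either actual evapotranspiration (AET) or soil moisture (SM) as an extra input to the ML model.

```python
import math
import random

def bucket_model(precip, pet, capacity=150.0):
    """Toy monthly bucket (hypothetical): returns AET, soil moisture, runoff."""
    storage = capacity / 2.0  # assumed initial state
    aet, sm, runoff = [], [], []
    for p, e in zip(precip, pet):
        available = storage + p
        a = min(e, available)                   # AET limited by available water
        remaining = available - a
        spill = max(remaining - capacity, 0.0)  # excess over capacity runs off
        storage = remaining - spill
        aet.append(a)
        sm.append(storage)
        runoff.append(spill)
    return aet, sm, runoff

def soft_threshold(x, lam):
    """Shrink x toward zero by lam (the Lasso proximal step)."""
    return math.copysign(max(abs(x) - lam, 0.0), x)

def fit_lasso(rows, y, alpha=0.1, n_iter=200):
    """Coordinate descent for (1/2n)*||y - Xw - b||^2 + alpha*||w||_1."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    y_mean = sum(y) / n
    xc = [[r[j] - means[j] for j in range(k)] for r in rows]  # centered inputs
    w = [0.0] * k
    resid = [yi - y_mean for yi in y]  # residual of the centered problem
    for _ in range(n_iter):
        for j in range(k):
            z = sum(x[j] ** 2 for x in xc) / n
            if z == 0.0:
                continue  # constant column carries no information
            rho = sum(x[j] * (r + w[j] * x[j]) for x, r in zip(xc, resid)) / n
            new_w = soft_threshold(rho, alpha) / z
            delta = new_w - w[j]
            resid = [r - delta * x[j] for x, r in zip(xc, resid)]
            w[j] = new_w
    intercept = y_mean - sum(wj * m for wj, m in zip(w, means))
    return w, intercept

def predict(w, intercept, row):
    return intercept + sum(wj * xj for wj, xj in zip(w, row))

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency; 1.0 is a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den

# Synthetic 10-year monthly record (illustrative only, not CAMELS data).
random.seed(0)
months = range(120)
precip = [random.uniform(0.0, 200.0) for _ in months]
pet = [60.0 + 40.0 * math.sin(2.0 * math.pi * m / 12.0) for m in months]
_, _, true_q = bucket_model(precip, pet, capacity=120.0)  # assumed "true" basin
observed = [q + random.gauss(0.0, 5.0) for q in true_q]
aet, sm, _ = bucket_model(precip, pet)  # imperfect conceptual model

# S1 feeds AET to the ML model, S2 feeds soil moisture.
for name, inter in {"S1 (AET)": aet, "S2 (SM)": sm}.items():
    rows = [[p, e, v] for p, e, v in zip(precip, pet, inter)]
    w, b = fit_lasso(rows[:84], observed[:84])  # first 7 years for training
    sim = [predict(w, b, r) for r in rows[84:]]
    print(name, "test NSE:", round(nse(observed[84:], sim), 3))
```

Because the data here are synthetic, the printed scores say nothing about which intermediate variable is preferable in real basins; the sketch only shows where the intermediate variable enters the pipeline. The features are left unscaled for brevity, so the effect of `alpha` depends on their units.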

Authors

  • Chao Deng
    School of Mechanical Science & Engineering, Huazhong University of Science & Technology, 1037 Luoyu Road, Wuhan, China. Electronic address: dengchao@hust.edu.cn.
  • Peiyuan Sun
    The National Key Laboratory of Water Disaster Prevention, Hohai University, Nanjing 210098, China; College of Hydrology and Water Resources, Hohai University, Nanjing 210098, China.
  • Xin Yin
    School of Software & Microelectronics, Peking University, Beijing 102600, China.
  • Jiacheng Zou
    Hydrology and Water Resources Monitoring Center of Lower Ganjiang River, Yichun 336000, China.
  • Weiguang Wang
    Decision, Operations and Information Technologies Department, Robert H. Smith School of Business, University of Maryland, College Park, Maryland, USA.