An automated machine learning-based framework for predicting groundwater quality with sensor data.
Journal:
Journal of Environmental Management
Published Date:
Jul 25, 2025
Abstract
Groundwater quality monitoring is a critical aspect of groundwater management and requires real-time, accurate measurement technologies. In this study, we introduce an automated framework for predicting NH-N in groundwater using multiparameter sensor data and machine learning. Data collected from a carcass burial site in Anseong, South Korea underwent rigorous quality control, including outlier detection and calibration against laboratory measurements. We then applied automated machine learning (AutoML) to optimize NH-N prediction models using a core set of features (NH-N, electrical conductivity, temperature, and Cl), achieving significant accuracy gains compared to raw sensor outputs. Specifically, R improved from 0.76 to 0.90, while the root mean square error (RMSE) and mean absolute error (MAE) declined from 0.84 to 0.38 and from 0.57 to 0.23, respectively. External validation using datasets from two hydrogeologically distinct regions demonstrated that the proposed framework achieved consistently high predictive performance (R = 0.89-0.98; RMSE = 0.008-0.02), underscoring its robustness across diverse contamination scenarios. These findings highlight the effectiveness of combining calibrated sensor data with automated model selection for robust, continuous surveillance. Our results underscore the potential for scalable, early detection strategies in sensitive environments, emphasizing how advanced analytics and automated calibration can enhance contamination alerts and support proactive groundwater management.
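The comparison of raw sensor outputs against calibrated predictions reported above can be illustrated with a minimal sketch. It is not the study's pipeline: it substitutes a one-variable least-squares calibration for the AutoML model search, it assumes the reported R denotes the coefficient of determination, and the numbers and the helper names (`fit_linear_calibration`, `rmse`, `mae`, `r2`) are hypothetical values and names made up for illustration.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between reference and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination (assumed meaning of the abstract's R)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_linear_calibration(sensor, lab):
    """Ordinary least-squares fit of lab = slope * sensor + intercept.

    A stand-in for the study's calibration and AutoML steps, not a
    reproduction of them.
    """
    n = len(sensor)
    mean_x = sum(sensor) / n
    mean_y = sum(lab) / n
    sxx = sum((x - mean_x) ** 2 for x in sensor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sensor, lab))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up values for illustration only; these are not data from the study.
lab = [1.0, 2.0, 3.0, 4.0, 5.0]      # hypothetical laboratory measurements
sensor = [1.6, 2.3, 3.9, 4.4, 5.8]   # hypothetical raw sensor readings

slope, intercept = fit_linear_calibration(sensor, lab)
calibrated = [slope * x + intercept for x in sensor]

for name, pred in (("raw sensor", sensor), ("calibrated", calibrated)):
    print(f"{name}: R2={r2(lab, pred):.3f} "
          f"RMSE={rmse(lab, pred):.3f} MAE={mae(lab, pred):.3f}")
```

Because the fit minimizes squared error on the same points it is scored on, the calibrated RMSE here cannot exceed the raw one; an evaluation like the external validation described above would instead score held-out data.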