The spatiotemporal evolution of dissolved-phase NAPL plumes revealed by the integrated groundwater quality and machine learning models.
Journal:
Water Research
Published Date:
Mar 22, 2025
Abstract
Rapid prediction of dissolved-phase contamination plume distributions is crucial for emergency remediation of aquifers contaminated with non-aqueous phase liquids (NAPLs). However, collecting and analyzing contaminated groundwater samples is expensive and therefore undertaken infrequently. Additionally, the heterogeneous features and complex biogeochemical reactions in aquifers often limit the application of traditional numerical modeling. This study developed a novel machine learning (ML) prediction framework that combines sliding window-based time-series prediction with general regression prediction. The goal was to predict the spatiotemporal distribution of dissolved-phase NAPL plumes from low-cost, easily measured in-situ groundwater quality parameters (iWQP), including pH, dissolved oxygen, oxidation-reduction potential, and electrical conductivity. The framework was applied to hypothetical but realistic field-scale reactive transport model cases representing different hydrogeological conditions and various dissolved-phase NAPL plumes. First, a sliding window-based Random Forest (RF) model was constructed to predict the iWQP at a target time from historical continuous-time iWQP data. Then, four ML models, namely RF, eXtreme Gradient Boosting, Multilayer Perceptron, and Long Short-Term Memory (LSTM), were employed to predict the spatial distribution of NAPL plumes at the target time using the predicted iWQP and low-frequency sampled historical datasets of dissolved-phase NAPL plumes. The prediction results revealed that the LSTM model showed the best performance (R > 0.92) and maintained temporal validity for the longest duration. Based on the permutation feature importance approach, pH was identified as the key iWQP for predicting dissolved-phase NAPL plumes.
Overall, the findings inform the subsequent development of data-driven models for real-time monitoring and pre-estimation of dissolved-phase NAPL levels in groundwater using iWQP sensors, and can assist in swift decision-making for groundwater remediation in NAPL-contaminated zones.
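The two-stage workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic stand-ins, the window length, train/test split, and hyperparameters are arbitrary assumptions, the helper `make_windows` is hypothetical, and an RF regressor is used in the second stage purely for brevity (the study reports LSTM as the best performer there). The sketch assumes scikit-learn and NumPy are available.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

FEATURES = ["pH", "DO", "ORP", "EC"]  # the four iWQP named in the abstract


def make_windows(series, window):
    """Hypothetical helper: map a (T, F) series to sliding-window inputs
    of shape (T - window, window * F) and next-step targets (T - window, F)."""
    series = np.asarray(series, dtype=float)
    X = np.stack([series[t - window:t].ravel()
                  for t in range(window, len(series))])
    y = series[window:]
    return X, y


# Synthetic stand-in data (NOT from the study): 200 time steps at one well.
rng = np.random.default_rng(0)
t = np.arange(200)
iwqp = np.column_stack([
    7.0 + 0.3 * np.sin(t / 15.0),    # pH
    4.0 + 1.0 * np.cos(t / 20.0),    # dissolved oxygen, mg/L
    100.0 + 50.0 * np.sin(t / 25.0),  # oxidation-reduction potential, mV
    500.0 + 40.0 * np.cos(t / 30.0),  # electrical conductivity, uS/cm
]) + rng.normal(scale=0.01, size=(200, 4))
# Assumed, purely illustrative link between iWQP and plume concentration.
conc = 10.0 * (7.4 - iwqp[:, 0]) + 0.2 * iwqp[:, 1] + rng.normal(scale=0.05, size=200)

WINDOW = 5   # assumed window length
SPLIT = 150  # assumed number of training windows

# Stage 1: sliding-window RF predicts the iWQP at the next time step.
X_ts, y_ts = make_windows(iwqp, WINDOW)
ts_model = RandomForestRegressor(n_estimators=100, random_state=0)
ts_model.fit(X_ts[:SPLIT], y_ts[:SPLIT])
iwqp_pred = ts_model.predict(X_ts[SPLIT:])  # one row per held-out time step

# Stage 2: regression from iWQP to concentration, trained on historical
# samples and applied to the iWQP predicted in stage 1.
cut = SPLIT + WINDOW  # first time index of the held-out period
reg_model = RandomForestRegressor(n_estimators=100, random_state=0)
reg_model.fit(iwqp[:cut], conc[:cut])
conc_pred = reg_model.predict(iwqp_pred)

# Permutation feature importance of each iWQP on the held-out period.
result = permutation_importance(reg_model, iwqp[cut:], conc[cut:],
                                n_repeats=10, random_state=0)
for name, score in zip(FEATURES, result.importances_mean):
    print(f"{name}: {score:.3f}")
```

Any importance ranking printed here reflects only the assumed synthetic relationship, not the study's finding. In practice the second stage would be trained on the low-frequency concentration samples actually available, and the window length would need tuning against the sensor logging interval.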