Using a deep convolutional network to predict the longitudinal dispersion coefficient.
Journal:
Journal of Contaminant Hydrology
Published Date:
Mar 19, 2021
Abstract
Given the interest in accurately predicting the longitudinal dispersion coefficient (D) in hydraulic and water quality modeling, a wide range of methods has been used to estimate this parameter. To improve the accuracy of D predictions, this paper proposes the use of a deep convolutional network (DCN), a deep learning technique within machine learning. The proposed deep neural network architecture consists of two parts: first, a one-dimensional convolutional neural network (CNN) that builds informative feature maps, and second, a stack of deep, fully connected layers that estimates pollution dispersion (as D) in streams. To accurately predict D, the developed model draws upon a large and diverse array of datasets expressed as three dimensionless parameters: Width/Depth (W/H), Velocity/Shear Velocity (U/u*), and Longitudinal Dispersion Coefficient/(Depth * Shear Velocity) (D/Hu*). The model's accuracy is compared to that of several empirical models using a number of statistical measures. In addition, the DCN model results are compared with artificial neural network (ANN) and support vector machine (SVM) models implemented in this research, and with similar studies that apply various machine learning (ML) models to D prediction. The statistical evaluation indicates that the DCN model outperforms the tested empirical, ANN, SVM, and ML models by a significant margin. Additionally, five-fold cross-validation is performed to analyze the sensitivity of the DCN model's results to dataset selection; it shows that the dataset selection process does not significantly affect the model's accuracy. Since both ML and empirical models are, in general, poor predictors of the upper and lower ranges of D values, the DCN model's predictions of D in six different extreme-value ranges are assessed. The DCN model shows excellent accuracy in estimating D over the full possible range of data.
In comparison with the empirical and ML models mentioned above, the DCN model more accurately predicts D values from river geometry and hydraulic datasets, with low errors across all ranges of D. The most significant advantage of the DCN is that it learns high-level features from the data incrementally, layer by layer.
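
The pipeline described in the abstract (dimensionless inputs, a one-dimensional convolution feeding fully connected layers, and five-fold cross-validation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not report layer sizes, kernel widths, activations, or training details, so every function name, the ReLU activation, the kernel size of 2, the 4 kernels, the 8 hidden units, and the random untrained weights below are assumptions chosen only to show the data flow.

```python
import random


def dimensionless_features(width, depth, velocity, shear_velocity):
    """Return the two input ratios (W/H, U/u*) for one river reach."""
    return width / depth, velocity / shear_velocity


def dimensionless_dispersion(d_coeff, depth, shear_velocity):
    """Return the normalized target D/(H u*)."""
    return d_coeff / (depth * shear_velocity)


def conv1d(signal, kernels, biases):
    """'Valid' 1D convolution followed by ReLU; one feature map per kernel."""
    maps = []
    for kernel, bias in zip(kernels, biases):
        k = len(kernel)
        maps.append([
            max(0.0, sum(w * x for w, x in zip(kernel, signal[i:i + k])) + bias)
            for i in range(len(signal) - k + 1)
        ])
    return maps


def dense(inputs, weights, biases, relu=True):
    """Fully connected layer; each row of `weights` produces one output unit."""
    out = []
    for row, b in zip(weights, biases):
        s = sum(w * x for w, x in zip(row, inputs)) + b
        out.append(max(0.0, s) if relu else s)
    return out


def init_params(n_kernels=4, kernel_size=2, n_inputs=2, hidden=8, seed=0):
    """Random, untrained weights; all sizes here are assumed, not from the paper."""
    rng = random.Random(seed)
    flat = n_kernels * (n_inputs - kernel_size + 1)

    def rand(n):
        return [rng.uniform(-0.5, 0.5) for _ in range(n)]

    return {
        "conv_w": [rand(kernel_size) for _ in range(n_kernels)],
        "conv_b": [0.0] * n_kernels,
        "fc1_w": [rand(flat) for _ in range(hidden)],
        "fc1_b": [0.0] * hidden,
        "out_w": [rand(hidden)],
        "out_b": [0.0],
    }


def forward(features, params):
    """Conv feature maps -> flatten -> hidden layer -> linear output for D/(H u*)."""
    maps = conv1d(list(features), params["conv_w"], params["conv_b"])
    flat = [v for m in maps for v in m]
    hidden = dense(flat, params["fc1_w"], params["fc1_b"])
    return dense(hidden, params["out_w"], params["out_b"], relu=False)[0]


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and return k (train, test) index splits."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        train = sorted(j for n, f in enumerate(folds) if n != i for j in f)
        splits.append((train, sorted(folds[i])))
    return splits
```

In use, each record would be converted with `dimensionless_features`, passed through `forward` to obtain an estimate of D/(Hu*), and multiplied by H and u* to recover D. Each split from `k_fold_indices` would drive one round of training and evaluation; the training loop itself is omitted here because the abstract gives no details about it.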