Using a deep convolutional network to predict the longitudinal dispersion coefficient.

Journal: Journal of Contaminant Hydrology
Published Date:

Abstract

Given the interest in accurately predicting the longitudinal dispersion coefficient (D) in hydraulic and water quality modeling, a wide range of methods have been used to estimate this parameter. To improve the accuracy of D predictions, this paper proposes the use of a deep convolutional network (DCN), a model from deep learning, which is a sub-field of machine learning. The proposed deep neural network architecture consists of two parts: first, a one-dimensional convolutional neural network (CNN) that builds informative feature maps, and second, a stack of deep, fully connected layers that estimates pollution dispersion (as D) in streams. To predict D accurately, the developed model draws upon a large and diverse array of datasets expressed as three dimensionless parameters: width/depth (W/H), velocity/shear velocity (U/u*), and longitudinal dispersion coefficient/(depth × shear velocity) (D/Hu*). The model's accuracy is compared with that of several empirical models using a number of statistical measures. In addition, the DCN model results are compared with artificial neural network (ANN) and support vector machine (SVM) models implemented in this research, and with similar studies that apply various machine learning (ML) models to D prediction. The statistical evaluation indicates that the DCN model outperforms the tested empirical, ANN, SVM and ML models by a significant margin. Additionally, five-fold cross-validation is performed to analyze how sensitive the DCN model's results are to dataset selection; it shows that the dataset selection process does not significantly affect the model's accuracy. Since both ML and empirical models are, in general, poor predictors of the upper and lower ranges of D values, the DCN model's predictions of D in six different extreme-value ranges are assessed. The DCN model shows excellent accuracy in estimating D over the full possible range of data.
In comparison with the empirical and ML models mentioned above, the DCN model more accurately predicts D values from river geometry and hydraulic datasets, with low errors across all ranges of D. The most significant advantage of the DCN is that it learns high-level features from the data incrementally.
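
As an illustration of the data preparation described in the abstract, the following minimal sketch computes the three dimensionless parameters (W/H, U/u*, D/Hu*), converts a dimensionless prediction back to D, and builds five-fold cross-validation splits. It is not the authors' code: the function names, the shuffling seed, and the sample reach values are hypothetical, and the way folds are formed here is an assumption, since the abstract does not specify it.

```python
import random


def to_dimensionless(width, depth, velocity, shear_velocity, dispersion=None):
    """Return (W/H, U/u*, D/(H*u*)); the third value is None if D is not given."""
    w_over_h = width / depth
    u_over_ustar = velocity / shear_velocity
    target = None
    if dispersion is not None:
        target = dispersion / (depth * shear_velocity)
    return w_over_h, u_over_ustar, target


def to_dispersion(dimensionless_d, depth, shear_velocity):
    """Convert a dimensionless value D/(H*u*) back to D."""
    return dimensionless_d * depth * shear_velocity


def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Return a list of (train_indices, test_indices) pairs, one per fold.

    Assumption: samples are shuffled once and dealt into folds round-robin.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[k::n_folds] for k in range(n_folds)]
    splits = []
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((train, test))
    return splits


# Hypothetical reach (illustrative values only, not from the paper):
# W = 30 m, H = 1.5 m, U = 0.5 m/s, u* = 0.0625 m/s, D = 15 m^2/s
w_h, u_ustar, d_star = to_dimensionless(30.0, 1.5, 0.5, 0.0625, dispersion=15.0)
recovered_d = to_dispersion(d_star, 1.5, 0.0625)

for train_idx, test_idx in five_fold_indices(n_samples=23):
    # A model would be fitted on train_idx and evaluated on test_idx here.
    pass
```

In a workflow of this kind, the model would take W/H and U/u* as inputs and predict D/Hu*; multiplying the prediction by H and u* then gives D in physical units.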

Authors

  • Behzad Ghiasi
    School of Environment, College of Engineering, University of Tehran, Tehran, Iran. E-mail: niksokhan@ut.ac.ir.
  • Ata Jodeiri
    School of Electrical and Computer Engineering, University College of Engineering, University of Tehran, North Kargar st., Tehran 1439957131, Iran; Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan. Electronic address: ata.jodeiri@ut.ac.ir.
  • Behnam Andik
    School of Environment, College of Engineering, University of Tehran, Iran.