Machine learning and deep learning enabled fuel sooting tendency prediction from molecular structure.

Journal: Journal of molecular graphics & modelling
PMID:

Abstract

Soot formation models are becoming increasingly important in formulating advanced renewable fuels that reduce soot. This work evaluates the performance of machine learning (ML) and deep learning (DL) in predicting the yield sooting index (YSI) from chemical structure and proposes a tailor-made convolutional neural network (CNN), SDSeries38, for regression problems. In the ML approach, a novel quantitative structure-property relationship (QSPR) is developed for feature extraction, and an ML algorithm builds the relationship between molecular structure and YSI. In the DL approach, SDSeries38 contains 9 feature learning modules and 1 regression module for automated feature learning and regression. It adopts a standard series network architecture with a modular structure; each feature learning module is a stack of convolution, batch normalization, activation, and pooling layers. The ML-QSPR model outperforms SDSeries38 in accuracy (RMSE = 7.563 vs 19.58) and computational speed, and it also applies to fuel mixtures. Among the DL models, SDSeries38 outperforms 10 classical CNNs and provides a generic architecture that can be transferred to other regression problems. The application of DL to regression is still in its infancy, and there is no complete guide on how to develop CNN architectures specifically for regression. Several gaps need to be filled: (1) CNN architectures developed specifically for regression are required; (2) directly transferring classical CNN architectures from classification to regression gives only modest performance, and a modular structure with typical function modules may provide an ideal solution; (3) deepening the sequence of convolution layers improves predictive accuracy, but the number of layers must be kept below the threshold at which gradients vanish.
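
The modular series layout described in the abstract can be illustrated with a minimal sketch. It only enumerates layer descriptors in order and does not train anything. The names `Layer`, `feature_module`, `regression_module`, and `build_series_network` are hypothetical, and the contents of the regression module are an assumption, since the abstract does not list them. The resulting layer count is a property of this sketch, not a statement about the published network.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Layer:
    kind: str    # layer type, e.g. "conv" or "pool"
    module: str  # name of the module the layer belongs to


def feature_module(index: int) -> List[Layer]:
    """One feature learning module: convolution, batch normalization,
    activation, pooling, in that order (as stated in the abstract)."""
    name = f"feature_{index}"
    kinds = ("conv", "batchnorm", "activation", "pool")
    return [Layer(kind, name) for kind in kinds]


def regression_module() -> List[Layer]:
    """Assumption: the abstract does not specify these layers. A fully
    connected layer followed by a regression output is a common choice."""
    return [
        Layer("fully_connected", "regression"),
        Layer("regression_output", "regression"),
    ]


def build_series_network(n_feature_modules: int = 9) -> List[Layer]:
    """Chain the modules one after another (a series network: each layer
    feeds only the next one, with no branches or skip connections)."""
    layers: List[Layer] = []
    for i in range(1, n_feature_modules + 1):
        layers.extend(feature_module(i))
    layers.extend(regression_module())
    return layers


network = build_series_network()
for position, layer in enumerate(network, start=1):
    print(f"{position:2d}  {layer.module:<12s} {layer.kind}")
```

Because every feature learning module has the same internal layout, changing the depth is a matter of changing `n_feature_modules`, which is the kind of reuse the modular structure is meant to allow.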

Authors

  • Runzhao Li
    Department of Mechanical Engineering, School of Engineering, College of Engineering and Physical Sciences, University of Birmingham, Edgbaston, Birmingham, B15 2TT, United Kingdom.
  • Jose Martin Herreros
    Department of Mechanical Engineering, School of Engineering, College of Engineering and Physical Sciences, University of Birmingham, Edgbaston, Birmingham, B15 2TT, United Kingdom.
  • Athanasios Tsolakis
    Department of Mechanical Engineering, School of Engineering, College of Engineering and Physical Sciences, University of Birmingham, Edgbaston, Birmingham, B15 2TT, United Kingdom. Electronic address: a.tsolakis@bham.ac.uk.
  • Wenzhao Yang
    Shenzhen Gas Corporation Ltd., No.268, Meiao 1st Road, Futian District, Shenzhen, 518049, China.