Application of high-precision solubility prediction models in the assisted design of drug-like compounds.

Journal: Molecular Diversity
Published Date:

Abstract

Machine learning (ML) techniques are increasingly being applied to assisted drug design. To make the solubility prediction step of drug design more efficient, two machine learning models are developed and trained on two distinct feature sets derived from the Zenodo dataset. Both models are built around a multilayer perceptron and combine Bayesian optimization with Monte Carlo methods to improve prediction accuracy. Training uses RMSprop to speed up convergence, Dropout to prevent overfitting, and a Self-Attention mechanism to focus on important features. For three types of compounds, the correlation coefficients between predicted and actual solubility all remain above 0.99. The mean absolute errors of the two models' solubility predictions are less than 0.200 mol/L and 0.050 mol/L, respectively. The trained models can predict the solubility of thousands of compounds in just 94.7 ms and 57.7 ms, respectively. These two models can therefore support faster and more rational design of drug-like compounds.
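
The two accuracy measures quoted in the abstract, the correlation coefficient and the mean absolute error, can be illustrated with a minimal Python sketch. It assumes the correlation coefficient is the Pearson coefficient, which the abstract does not state. The function names and the solubility values are hypothetical and are not taken from the article or from the Zenodo dataset.

```python
import math


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    # Sums of squared deviations (unnormalized variances).
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical solubilities in mol/L, for illustration only.
actual_solubility = [1.0, 2.0, 3.0, 4.0]
predicted_solubility = [1.1, 1.9, 3.2, 3.8]

r = pearson_r(actual_solubility, predicted_solubility)
mae = mean_absolute_error(actual_solubility, predicted_solubility)
print(f"correlation coefficient: {r:.4f}")
print(f"mean absolute error: {mae:.3f} mol/L")
```

On this toy data the correlation coefficient is above 0.99 and the mean absolute error is below 0.200 mol/L, which shows how a model can satisfy both thresholds at once. The figures say nothing about the models described in the article.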

Authors

  • Yutong Gao
    State Key Laboratory of Oral & Maxillofacial Reconstruction and Regeneration, Key Laboratory of Oral Biomedicine Ministry of Education, Hubei Key Laboratory of Stomatology, School & Hospital of Stomatology, Wuhan University, Wuhan, China.
