Application of high-precision solubility prediction models in the assisted design of drug-like compounds.
Journal: Molecular Diversity
Published Date: May 27, 2025
Abstract
Machine learning (ML) techniques are increasingly applied to assisted drug design. To make the solubility prediction step of drug design more efficient, two machine learning models are developed and trained on two distinct feature sets derived from the Zenodo dataset. Both models are built around a multilayer perceptron and combine Bayesian optimization with Monte Carlo methods to improve prediction accuracy. Training uses RMSprop to speed up convergence, Dropout to reduce overfitting, and a Self-Attention mechanism to focus on important features. Across three types of compounds, the correlation coefficients between predicted and actual solubility all remain above 0.99. The mean absolute errors of the two models' solubility predictions are below 0.200 mol/L and 0.050 mol/L, respectively. Once trained, the models predict the solubility of thousands of compounds in 94.7 ms and 57.7 ms, respectively. These two models can therefore support faster and more rational design of drug-like compounds.
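The abstract reports two evaluation figures: a correlation coefficient between predicted and actual solubility, and a mean absolute error in mol/L. As a minimal sketch of how such figures are commonly computed, the following uses only the Python standard library. It assumes the correlation coefficient is the Pearson coefficient, which the abstract does not state. The function names and the sample values are hypothetical and are not taken from the paper.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all compounds (same unit as the inputs)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("inputs must have equal length and at least two values")
    mean_a = sum(actual) / len(actual)
    mean_p = sum(predicted) / len(predicted)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant inputs")
    return cov / math.sqrt(var_a * var_p)


# Made-up solubility values in mol/L, for illustration only (not data from the paper).
actual = [0.10, 0.50, 1.20, 2.00]
predicted = [0.12, 0.47, 1.25, 1.96]

mae = mean_absolute_error(actual, predicted)
r = pearson_r(actual, predicted)
print(f"MAE = {mae:.3f} mol/L, r = {r:.4f}")
```

A high correlation coefficient alone does not guarantee small errors, since a model with a constant offset can still correlate almost perfectly with the true values. Reading the two figures together gives a fuller picture of prediction quality.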