Machine Learning Based Quantitative Structure-Dissolution Profile Relationship.

Journal: Journal of Chemical Information and Modeling
Published Date:

Abstract

Accurately characterizing drug dissolution in the gastrointestinal tract is critical in drug discovery, because dissolution profiles provide essential information for estimating the bioavailability of orally administered drugs. While various methods have been developed to predict drug solubility from chemical structure, no reliable tools currently exist for predicting the dissolution rate constant. This study presents a novel two-stage machine learning approach, termed Machine Learning based Quantitative Structure-Dissolution Profile Relationship, which integrates physics-informed neural networks (PINNs) and deep neural networks (DNNs) to predict drug dissolution profiles in water containing varying concentrations of the surfactant sodium lauryl sulfate. In the first stage, PINNs extract two key dissolution parameters from existing dissolution data: the dissolution rate constant and the dissolved mass fraction at saturation (ϕ). By leveraging a physical law governing the dissolution process, PINNs aim to enhance prediction performance and reduce data requirements. Assuming first-order kinetics of the drug dissolution process, as described by the Noyes-Whitney equation, PINNs with 8 hidden layers and 40 neurons per layer may outperform traditional nonlinear regression by effectively filtering noise and focusing on physically meaningful data. In the second stage, the extracted parameters (the rate constant and ϕ) are used to train a DNN to predict dissolution profiles from the drug's chemical structure and the dissolution medium. Evaluated with the FDA-recommended difference and similarity factors (f1 and f2), the DNN, with 128 neurons in two hidden layers and a learning rate of 10, achieved an average testing accuracy of 61.7% at an 80:20 train-to-test split. Although this accuracy is below the generally acceptable range of 70-80%, the approach shows significant potential as a low-cost, time-efficient tool for early-phase drug formulation. Future improvements are expected as data quality and diversity increase.
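
The abstract names two ingredients that a short sketch can make concrete: a first-order dissolution profile and the FDA difference and similarity factors. The Python below is a minimal illustration and not the authors' code. It assumes the first-order form ϕ(t) = ϕ_sat · (1 − exp(−k·t)), where the symbol k for the rate constant, the sampling times, and all parameter values are hypothetical choices made for this example. The f1 and f2 formulas follow the standard FDA definitions, and the thresholds f1 ≤ 15 and f2 ≥ 50 are the commonly cited similarity criteria. The abstract does not say how these factors are converted into the reported accuracy, so that step is not shown.

```python
import math


def first_order_profile(times, k, phi_sat):
    """Percent dissolved at each time point.

    Assumed model (not taken from the source): phi(t) = phi_sat * (1 - exp(-k * t)).
    """
    return [100.0 * phi_sat * (1.0 - math.exp(-k * t)) for t in times]


def difference_factor_f1(reference, test):
    """FDA difference factor: 100 * sum(|R_t - T_t|) / sum(R_t)."""
    total_abs_diff = sum(abs(r - t) for r, t in zip(reference, test))
    return 100.0 * total_abs_diff / sum(reference)


def similarity_factor_f2(reference, test):
    """FDA similarity factor: 50 * log10(100 / sqrt(1 + mean squared difference))."""
    n = len(reference)
    mean_sq_diff = sum((r - t) ** 2 for r, t in zip(reference, test)) / n
    return 50.0 * math.log10(100.0 / math.sqrt(1.0 + mean_sq_diff))


# Hypothetical sampling times (minutes) and parameters, for illustration only.
times = [5, 10, 15, 30, 45, 60]
reference = first_order_profile(times, k=0.08, phi_sat=0.95)  # stands in for measured data
predicted = first_order_profile(times, k=0.07, phi_sat=0.93)  # stands in for a model prediction

f1 = difference_factor_f1(reference, predicted)
f2 = similarity_factor_f2(reference, predicted)

# Commonly cited criteria: profiles are considered similar if f1 <= 15 and f2 >= 50.
profiles_similar = f1 <= 15.0 and f2 >= 50.0
print(f"f1 = {f1:.2f}, f2 = {f2:.2f}, similar = {profiles_similar}")
```

Two identical profiles give f1 = 0 and f2 = 100, which is a quick sanity check when adapting the functions to real data.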

Authors

  • Lap Au-Yeung
    Department of Mechanical Engineering, University of Alberta, Edmonton T6G 2R3, Alberta, Canada.
  • Chih-Yuan Tseng
    Sinoveda Canada Inc., Edmonton T6E 5V2, Alberta, Canada.
  • Yun K Tam
    Sinoveda Canada Inc., Edmonton T6E 5V2, Alberta, Canada.
  • Peichun Amy Tsai
    Department of Mechanical Engineering, University of Alberta, Edmonton T6G 2R3, Alberta, Canada.