Machine learning analysis of rivaroxaban solubility in mixed solvents for application in pharmaceutical crystallization.

Journal: Scientific Reports
PMID:

Abstract

This study investigates the use of machine learning models to predict the solubility of rivaroxaban in binary solvents from temperature (T), mass fraction (w), and solvent type. Using a dataset of over 250 data points, with solvent type one-hot encoded, four models were compared: Gradient Boosting (GB), Light Gradient Boosting (LGB), Extra Trees (ET), and Random Forest (RF). The Jellyfish Optimizer (JO) algorithm was applied to tune the hyperparameters of each model. The LGB model achieved the best results, with an R² of 0.988 on the test set and low errors (RMSE of 9.1284E-05 and MAE of 5.85322E-05), surpassing the other models in predictive accuracy and generalizability. Parity plots confirmed close agreement between the solubility values predicted by LGB and the measured values. Furthermore, 3D surface plots and partial effect plots showed that LGB captures the interactions between T, w, and solvent type across the different solvent systems. Finally, the LGB model predicted maximum solubility at a temperature of 305.76 K and a mass fraction of 0.753 in a dichloromethane + methanol mixture, a result that can guide solvent selection when optimizing solubility. This work underscores the effectiveness of the LGB model for solubility prediction, with potential applications in formulation and experimental planning.
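
As a rough illustration of the input representation and error metrics named in the abstract, the following minimal sketch builds a feature row from T, w, and a one-hot encoded solvent, and computes RMSE and MAE. It is not the authors' code. The solvent list (apart from dichloromethane + methanol), the function names, and the toy numbers in the usage note are assumptions made for illustration; the coefficient of determination is included on the assumption that it is the goodness-of-fit measure reported.

```python
import math

# Hypothetical solvent labels: the abstract names only dichloromethane + methanol,
# so the other two entries are placeholders, not solvents from the study.
SOLVENTS = ["dichloromethane + methanol", "solvent B", "solvent C"]


def encode_row(temperature_k, mass_fraction, solvent, solvents=SOLVENTS):
    """Return [T, w, one-hot solvent indicators...] for one measurement."""
    if solvent not in solvents:
        raise ValueError(f"unknown solvent: {solvent!r}")
    one_hot = [1.0 if name == solvent else 0.0 for name in solvents]
    return [float(temperature_k), float(mass_fraction)] + one_hot


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

For example, `encode_row(305.76, 0.753, "dichloromethane + methanol")` yields a five-element feature vector whose last three entries mark the solvent system. A tree-based regressor would be trained on such rows, and its held-out predictions scored with `rmse`, `mae`, and `r2`.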

Authors

  • Mohammed Alqarni
    Department of Pharmaceutical Chemistry, College of Pharmacy, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia.
  • Ali Alqarni
    Department of Oral & Maxillofacial Surgery and Diagnostic Sciences, Faculty of Dentistry, Taif University, Taif, Saudi Arabia.