When Simulations Meet Machine Learning: Redefining Molecular Docking for Protein-Glycosaminoglycan Systems.
Journal:
Journal of Computational Chemistry
Published Date:
Jun 30, 2025
Abstract
Glycosaminoglycans (GAGs) are linear, negatively charged carbohydrates that modulate enzymatic activity in the extracellular matrix. Their high flexibility and the specificity of protein-GAG interactions pose challenges for both experimental and computational studies. Here, the repulsive scaling replica exchange molecular dynamics (RS-REMD) method, combined with molecular mechanics generalized Born surface area (MM-GBSA) calculations, was implemented using the CHARMM36m force field to evaluate its ability to guide ligands to their native binding sites in seven protein-GAG/carbohydrate complexes. Five machine learning (ML) models, namely a fully connected neural network (FCNN), linear regression, LightGBM, random forest, and a support vector regressor (SVR), were also trained to predict binding accuracy (RMSatd) from MM-GBSA energy components, protein-GAG distances, and hydrogen bond counts. While MM-GBSA values showed weak to moderate correlations with RMSatd, most of the trained ML models significantly improved the selection of native-like binding poses, with the random forest model providing the most accurate predictions. This study highlights the potential of integrating simulations with ML to refine molecular docking for flexible ligands such as GAGs.