When Simulations Meet Machine Learning: Redefining Molecular Docking for Protein-Glycosaminoglycan Systems.

Journal: Journal of Computational Chemistry
Published Date:

Abstract

Glycosaminoglycans (GAGs) are linear, negatively charged carbohydrates that modulate enzymatic activity in the extracellular matrix. Their high flexibility and the specificity of protein-GAG interactions pose challenges for both experimental and computational studies. Here, the repulsive scaling replica exchange molecular dynamics (RS-REMD) method, combined with molecular mechanics generalized Born surface area (MM-GBSA) calculations, was implemented using the CHARMM36m force field to evaluate its ability to guide ligands to their native binding sites in seven protein-GAG/carbohydrate complexes. Five machine learning (ML)-based models, namely a fully connected neural network (FCNN), linear regression, LightGBM, random forest, and a support vector regressor (SVR), were also trained to predict binding accuracy (RMSatd) from MM-GBSA energy components, protein-GAG distances, and hydrogen bond counts. While MM-GBSA values showed weak to moderate correlations with RMSatd, most of the trained ML models significantly improved the selection of native-like binding poses, with the random forest model providing the most accurate predictions. This study highlights the potential of integrating simulations with ML to refine molecular docking for flexible ligands such as GAGs.
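
The pose-selection idea described in the abstract can be illustrated with a minimal sketch: fit a regression model that maps per-pose descriptors to RMSatd, then pick the pose with the lowest predicted value. The example below uses ordinary least-squares linear regression (one of the five model types named above) via NumPy. It is not the study's implementation: the feature layout, function names, coefficients, and all data are hypothetical placeholders generated synthetically for illustration.

```python
import numpy as np


def _design_matrix(features):
    """Prepend a column of ones so the model includes an intercept."""
    features = np.asarray(features, dtype=float)
    return np.column_stack([np.ones(len(features)), features])


def fit_linear_model(features, rmsatd):
    """Fit RMSatd ~ intercept + features by ordinary least squares."""
    weights, *_ = np.linalg.lstsq(_design_matrix(features),
                                  np.asarray(rmsatd, dtype=float),
                                  rcond=None)
    return weights


def predict_rmsatd(weights, features):
    """Predicted RMSatd for each pose (one row of features per pose)."""
    return _design_matrix(features) @ weights


def select_pose(weights, features):
    """Index of the pose with the lowest predicted RMSatd."""
    return int(np.argmin(predict_rmsatd(weights, features)))


# Synthetic placeholder data (NOT from the study): one row per docking pose.
rng = np.random.default_rng(0)
n_poses = 200
features = np.column_stack([
    rng.normal(-40.0, 10.0, n_poses),  # hypothetical MM-GBSA energy term
    rng.uniform(3.0, 15.0, n_poses),   # hypothetical protein-GAG distance
    rng.integers(0, 12, n_poses),      # hypothetical hydrogen bond count
])
# Arbitrary "ground truth" relation plus noise, used only to make a target.
assumed_weights = np.array([4.0, 0.05, 0.8, -0.3])
rmsatd = predict_rmsatd(assumed_weights, features) + rng.normal(0.0, 0.5, n_poses)

# Train on the first 150 poses, then rank the held-out 50.
weights = fit_linear_model(features[:150], rmsatd[:150])
best = select_pose(weights, features[150:])
print("selected held-out pose:", best,
      "| its RMSatd:", round(float(rmsatd[150:][best]), 2))
```

A tree-based or neural model could be swapped in behind the same `fit`/`predict`/`select` interface. The linear form is used here only because it keeps the sketch short and dependency-light.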

Authors

  • Krzysztof K Bojarski
    Department of Physical Chemistry, Gdansk University of Technology, Gdansk, Poland.