A Physics-Guided Neural Network for Predicting Protein-Ligand Binding Free Energy: From Host-Guest Systems to the PDBbind Database.

Journal: Biomolecules
Published Date:

Abstract

Calculation of protein-ligand binding affinity is a cornerstone of drug discovery. Classic implicit solvent models, which have been widely used for this task, lack accuracy compared to experimental references. Emerging data-driven models, on the other hand, are often accurate yet not fully interpretable and are prone to overfitting. In this research, we explore the application of Theory-Guided Data Science to the study of protein-ligand binding. A hybrid model is introduced by integrating a Graph Convolutional Network (data-driven model) with the GBNSR6 implicit solvent model (physics-based model). The proposed physics-data model is tested on a dataset of 368 complexes from the PDBbind refined set and 72 host-guest systems. Results demonstrate that the proposed Physics-Guided Neural Network improves the "accuracy" of the purely data-driven model. In addition, the "interpretability" and "transferability" of our model are improved compared to the purely data-driven model. Further analyses include evaluating model robustness and understanding relationships between the physical features.

Authors

  • Sahar Cain
    Department of Computer Science, California State University, Los Angeles, CA 90032, USA.
  • Ali Risheh
    Department of Computer Engineering, Amirkabir University of Technology, Tehran 15914, Iran.
  • Negin Forouzesh
    Department of Computer Science, California State University, Los Angeles, CA 90032, USA.