From Static to Dynamic Structures: Improving Binding Affinity Prediction with Graph-Based Deep Learning.

Journal: Advanced science (Weinheim, Baden-Wurttemberg, Germany)
Published Date:

Abstract

Accurate prediction of protein-ligand binding affinities is an essential challenge in structure-based drug design. Despite recent advances in data-driven methods for affinity prediction, their accuracy is still limited, partly because they rely only on static crystal structures, whereas actual binding affinities are generally determined by the thermodynamic ensembles of proteins and ligands. One effective way to approximate such a thermodynamic ensemble is molecular dynamics (MD) simulation. Here, an MD dataset containing 3,218 different protein-ligand complexes is curated, and Dynaformer, a graph-based deep learning model, is developed to predict binding affinities by learning the geometric characteristics of protein-ligand interactions from the MD trajectories. In silico experiments demonstrate that the model exhibits state-of-the-art scoring and ranking power on the CASF-2016 benchmark dataset, outperforming previously reported methods. Moreover, in a virtual screening on heat shock protein 90 (HSP90) using Dynaformer, 20 candidates are identified and their binding affinities are experimentally validated. Dynaformer displays promising results in virtual drug screening, revealing 12 hit compounds (two in the submicromolar range), including several novel scaffolds. Overall, these results demonstrate that the approach offers a promising avenue for accelerating the early drug discovery process.

Authors

  • Yaosen Min
    Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, 100084, China.
  • Ye Wei
    Max-Planck-Institut für Eisenforschung GmbH, Max-Planck-Straße 1, Düsseldorf, Germany.
  • Peizhuo Wang
    Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, 100084, China.
  • Xiaoting Wang
    He University, Shenyang, 110000, China.
  • Han Li
  • Nian Wu
    Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, 100084, China.
  • Stefan Bauer
    Department of Computer Science, ETH Zurich, Zürich, Switzerland.
  • Shuxin Zheng
    School of Economics and Business, Changzhou Vocational Institute of Textile and Garment, Changzhou, China.
  • Yu Shi
    NIH BD2K Program Centers of Excellence for Big Data Computing-KnowEng Center, Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, Illinois.
  • Yingheng Wang
    Department of Electrical Engineering, Tsinghua University, Beijing, 100084, China.
  • Ji Wu
    Department of Urology, Nanchong Central Hospital, Nanchong, Sichuan, China.
  • Dan Zhao
    Key Laboratory of Hunan Province for Water Environment and Agriculture Product Safety, College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, China.
  • Jianyang Zeng
    Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China; MOE Key Laboratory of Bioinformatics, Tsinghua University, Beijing 100084, China. Electronic address: zengjy321@tsinghua.edu.cn.