Benchmarking 3D Structure-Based Molecule Generators.

Journal: Journal of chemical information and modeling
Published Date:

Abstract

To understand the benefits and drawbacks of 3D combinatorial and deep learning generators, a novel benchmark was created focusing on the recreation of important protein-ligand interactions and 3D ligand conformations. Using the BindingMOAD data set with a hold-out blind set, three classes of generator were evaluated: the sequential graph neural network generators (Pocket2Mol and PocketFlow), the diffusion models (DiffSBDD and MolSnapper), and the combinatorial genetic algorithms (AutoGrow4 and LigBuilderV3). The evaluation showed that deep learning methods fail to generate structurally valid molecules and 3D conformations, whereas combinatorial methods are slow and generate molecules that are prone to failing 2D MOSES filters. These results guide us toward improving deep learning structure-based generators by placing higher importance on structural validity, 3D ligand conformations, and recreation of important known active site interactions. This benchmark should be used to understand the limitations of future combinatorial and deep learning generators. The package is freely available under an Apache 2.0 license at github.com/gskcheminformatics/SBDD-benchmarking.

Authors

  • Natasha Sanjrani
    Department of Cheminformatics, Research Technologies, GSK, Gunnels Wood Road, Stevenage SG1 2NY, U.K.
  • Damien E Coupry
    Department of Cheminformatics, Research Technologies, GSK, Gunnels Wood Road, Stevenage SG1 2NY, U.K.
  • Peter Pogány
    GSK Medicines Research Centre, Gunnels Wood Road, Stevenage SG1 2NY, U.K.
  • David S Palmer
  • Stephen D Pickett
    GSK Medicines Research Centre, Gunnels Wood Road, Stevenage SG1 2NY, U.K.