The Impact of Data on Structure-Based Binding Affinity Predictions Using Deep Neural Networks.

Journal: International Journal of Molecular Sciences

Abstract

Artificial intelligence (AI) has gained significant traction in the field of drug discovery, with deep learning (DL) algorithms playing a crucial role in predicting protein-ligand binding affinities. Despite advancements in neural network architectures, system representation, and training techniques, the performance of DL affinity prediction has reached a plateau, prompting the question of whether the problem is truly solved or whether current performance is overly optimistic and reliant on biased, easily predictable data. As with other DL problems, this issue seems to stem from the training and test sets used when building the models. In this work, we investigate the impact of several parameters related to the input data on the performance of neural network affinity prediction models. Notably, we identify the size of the binding pocket as a critical factor influencing the performance of our statistical models; furthermore, we find that it is more important to train a model with as much data as possible than to restrict the training to only high-quality datasets. Finally, we confirm the bias in the test sets currently in common use. Therefore, several types of evaluation and benchmarking are required to understand models' decision-making processes and to accurately compare their performance.

Authors

  • Pierre-Yves Libouban
    Institute of Organic and Analytical Chemistry (ICOA), UMR7311, Université d'Orléans, CNRS, Pôle de Chimie rue de Chartres, 45067 Orléans, CEDEX 2, France.
  • Samia Aci-Sèche
    Institute of Organic and Analytical Chemistry (ICOA), UMR7311, Université d'Orléans, CNRS, Pôle de Chimie rue de Chartres, 45067 Orléans, CEDEX 2, France.
  • Jose Carlos Gómez-Tamayo
    Computational Chemistry, Janssen Research & Development, Janssen Pharmaceutica N. V., B-2340 Beerse, Belgium.
  • Gary Tresadern
    Janssen Research and Development, Turnhoutseweg 30, 2340 Beerse, Belgium.
  • Pascal Bonnet
    Institute of Organic and Analytical Chemistry (ICOA), UMR7311, Université d'Orléans, CNRS, Pôle de Chimie rue de Chartres, 45067 Orléans, CEDEX 2, France.