SAnDReS 2.0: Development of machine-learning models to explore the scoring function space.

Journal: Journal of computational chemistry
Published Date:

Abstract

Classical scoring functions may exhibit low accuracy in determining ligand binding affinity for proteins. The availability of both protein-ligand structures and affinity data makes it possible to develop machine-learning models focused on specific protein systems with superior predictive performance. Here, we report a new methodology named SAnDReS that combines AutoDock Vina 1.2 with 54 regression methods available in Scikit-Learn to calculate binding affinity based on protein-ligand structures. This approach allows exploration of the scoring function space. SAnDReS generates machine-learning models based on crystal, docked, and AlphaFold-generated structures. As a proof of concept, we examine the performance of SAnDReS-generated models in three case studies. For all three cases, our models outperformed classical scoring functions. Also, SAnDReS-generated models showed predictive performance close to or better than other machine-learning models such as KDEEP, CSM-lig, and ΔVinaRF20. SAnDReS 2.0 is available to download at https://github.com/azevedolab/sandres.
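The idea of exploring the scoring function space can be illustrated with a minimal sketch: fit several Scikit-Learn regressors on the same table of per-complex energy terms and rank them by error on held-out complexes. This is not the SAnDReS code. The feature names, the synthetic data, the helper names make_synthetic_dataset and explore_regressors, and the choice of five regressors (rather than 54) are all assumptions made for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Hypothetical feature names standing in for per-complex docking energy terms.
FEATURES = ["gauss1", "gauss2", "repulsion", "hydrophobic", "hydrogen_bond", "torsions"]


def make_synthetic_dataset(n_complexes=200, seed=0):
    """Create placeholder data: one row of energy terms per complex plus a
    synthetic affinity value. This is NOT real experimental data."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_complexes, len(FEATURES)))
    true_weights = np.array([-0.5, -0.2, 0.8, -0.4, -0.6, 0.1])
    y = 6.0 + X @ true_weights + rng.normal(scale=0.3, size=n_complexes)
    return X, y


def explore_regressors(X, y, seed=0):
    """Fit several regressors on the same split and rank them by test RMSE."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    # A small, assumed subset of regression methods; each one defines a
    # different candidate scoring function over the same energy terms.
    candidates = {
        "LinearRegression": LinearRegression(),
        "Ridge": Ridge(alpha=1.0),
        "Lasso": Lasso(alpha=0.01),
        "ElasticNet": ElasticNet(alpha=0.01, l1_ratio=0.5),
        "RandomForestRegressor": RandomForestRegressor(
            n_estimators=100, random_state=seed
        ),
    }
    results = []
    for name, model in candidates.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
        results.append((name, rmse, float(r2_score(y_test, y_pred))))
    # Lowest held-out RMSE first.
    return sorted(results, key=lambda row: row[1])


if __name__ == "__main__":
    X, y = make_synthetic_dataset()
    for name, rmse, r2 in explore_regressors(X, y):
        print(f"{name:24s} RMSE={rmse:.3f} R2={r2:.3f}")
```

In a real study the synthetic table would be replaced by energy terms computed from crystal, docked, or predicted structures and by experimental affinities for a specific protein system, and the ranking would typically rely on cross-validation rather than a single split.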

Authors

  • Walter Filgueira de Azevedo
    Laboratory of Computational Systems Biology, School of Sciences, Pontifical Catholic University of Rio Grande do Sul, Av. Ipiranga, 6681 Partenon Porto Alegre-RS, 90619-900, Brazil.
  • Rodrigo Quiroga
    CONICET-Departamento de Matemática y Física, Instituto de Investigaciones en Fisicoquímica de Córdoba (INFIQC), Facultad de Ciencias Químicas, Universidad Nacional de Córdoba, Ciudad Universitaria, Córdoba, Argentina.
  • Marcos Ariel Villarreal
    Instituto de Investigaciones en Fisicoquímica de Córdoba (INFIQC), CONICET-Departamento de Química Teórica y Computacional, Facultad de Ciencias Químicas, Universidad Nacional de Córdoba, Ciudad Universitaria, Córdoba, Argentina.
  • Nelson José Freitas da Silveira
    Laboratory of Molecular Modeling and Computer Simulation, Federal University of Alfenas, Alfenas, Brazil.
  • Gabriela Bitencourt-Ferreira
    Laboratory of Computational Systems Biology, School of Sciences, Pontifical Catholic University of Rio Grande do Sul (PUCRS), Av. Ipiranga, 6681, Porto Alegre, RS 90619-900, Brazil.
  • Amauri Duarte da Silva
    Programa de Pós-Graduação em Tecnologias da Informação e Gestão em Saúde, Universidade Federal de Ciências da Saúde de Porto Alegre, Porto Alegre, Brazil.
  • Martina Veit-Acosta
    Western Michigan University, 1903 West Michigan Ave, Kalamazoo, MI 49008, United States.
  • Patrícia Rufino Oliveira
  • Marco Tutone
    Department of Infectious Diseases, Azienda Ospedaliero-Universitaria Policlinico of Modena, Modena, Italy.
  • Nadezhda Biziukova
    Institute of Biomedical Chemistry, Pogodinskaya Str., 10/8, Moscow, 119121, Russia.
  • Vladimir Poroikov
    Institute of Biomedical Chemistry, Pogodinskaya Str., 10/8, Moscow, 119121, Russia.
  • Olga Tarasova
    Institute of Biomedical Chemistry, Pogodinskaya Str., 10/8, Moscow, 119121, Russia.
  • Stéphanie Baud
    Laboratoire SiRMa, UMR CNRS/URCA 7369, UFR Sciences Exactes et Naturelles, Université de Reims Champagne-Ardenne, CNRS, MEDYC, Reims, France.