Ligand-Based Drug Discovery Leveraging State-of-the-Art Machine Learning Methodologies Exemplified by Cdr1 Inhibitor Prediction.

Journal: Journal of Chemical Information and Modeling
PMID:

Abstract

Artificial intelligence (AI) is revolutionizing drug discovery with unprecedented speed and efficiency. In computer-aided drug design, structure-based and ligand-based methodologies are the main driving forces for innovation. When no experimental 3D structure or high-confidence homology/AlphaFold-predicted model of the target is available, ligand-based strategies are generally preferable. Here, we aim to develop and evaluate new predictive AI models for ligand-based drug discovery. To illustrate our workflow, we propose, as an example, an ensemble classification model for Cdr1 inhibitor prediction. We leverage target-specific experimental data from different sources, various molecular feature types, and multiple state-of-the-art machine learning (ML) algorithms, alongside a multi-instance 3D graph neural network that considers multiple conformations of each molecule. The workflow combines Bayesian hyperparameter tuning, stacked generalization, and soft voting. The final target-specific ensemble model benefits from the classification and screening power of its constituent models. On an external test set structurally dissimilar to the training data, its average precision is 0.755, its F1-score is 0.714, its area under the receiver operating characteristic curve is 0.884, and its balanced accuracy is 0.799. On another test set outside the training chemical space, it yields a low false positive rate of 0.1236, indicating its ability to avoid false positives. The present work highlights the potential of stacking ensemble ML and offers a rigorous general workflow for building ligand-based predictive AI models for other targets.
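As a rough illustration of two ideas named in the abstract, the sketch below shows soft voting (averaging the positive-class probabilities of several base models) and how the F1-score, balanced accuracy, and false positive rate follow from the resulting confusion matrix. It is not the authors' implementation: the function names, the 0.5 decision threshold, and the toy probabilities for eight molecules are all assumptions made up for this example, and no value here comes from the paper.

```python
from statistics import mean


def soft_vote(prob_lists):
    """Average the positive-class probabilities of several models, molecule by molecule."""
    return [mean(probs) for probs in zip(*prob_lists)]


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true, y_pred):
    """Compute F1, balanced accuracy, and false positive rate from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    recall = _ratio(tp, tp + fn)        # true positive rate (sensitivity)
    specificity = _ratio(tn, tn + fp)   # true negative rate
    precision = _ratio(tp, tp + fp)

    return {
        "f1": _ratio(2 * precision * recall, precision + recall),
        "balanced_accuracy": (recall + specificity) / 2,
        "false_positive_rate": _ratio(fp, fp + tn),
    }


# Hypothetical toy data: 1 = inhibitor, 0 = non-inhibitor (not from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]

# Hypothetical positive-class probabilities from three base models.
model_a = [0.9, 0.6, 0.3, 0.2, 0.4, 0.1, 0.7, 0.3]
model_b = [0.8, 0.7, 0.5, 0.1, 0.3, 0.2, 0.4, 0.2]
model_c = [0.7, 0.5, 0.4, 0.3, 0.2, 0.3, 0.7, 0.1]

# Soft voting: average the probabilities, then apply an assumed 0.5 threshold.
ensemble_probs = soft_vote([model_a, model_b, model_c])
y_pred = [1 if p >= 0.5 else 0 for p in ensemble_probs]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Stacked generalization differs from this plain average in that a meta-learner is trained on the base models' predictions instead of weighting them equally. Average precision and the area under the ROC curve are ranking metrics computed from the probabilities themselves rather than from thresholded labels, so they are left out of this sketch.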

Authors

  • The-Chuong Trinh
    Faculty of Pharmacy, Grenoble Alpes University, La Tronche, 38700, France.
  • Pierre Falson
    Drug Resistance & Membrane Proteins Group, CNRS-Lyon 1 University Laboratory, UMR 5086, IBCP, 69367 CEDEX Lyon 07, France.
  • Viet-Khoa Tran-Nguyen
    Laboratoire d'Innovation Thérapeutique, UMR 7200 CNRS-Université de Strasbourg, 67400 Illkirch, France.
  • Ahcène Boumendjel
    Univ. Grenoble Alpes, INSERM, LRB, Grenoble 38000, France.