Deep Learning Models Compared to Experimental Variability for the Prediction of CYP3A4 Time-Dependent Inhibition.

Journal: Chemical research in toxicology
PMID:

Abstract

Most drugs are mainly metabolized by cytochrome P450 (CYP450), which can lead to drug-drug interactions (DDI). Specifically, time-dependent inhibition (TDI) of the CYP3A4 isoenzyme has been associated with clinically relevant DDI. To overcome potential DDI issues, high-throughput assays were established to assess the TDI of CYP3A4 during the discovery and lead optimization phases. However, machine learning models would enable an earlier and larger-scale assessment of potential TDI liabilities. For CYP inhibition, most modeling efforts have focused on highly imbalanced and small data sets. Moreover, assay variability is rarely considered, although it is key to understanding a model's quality and suitability for decision-making. In this work, machine learning models were built for the prediction of TDI of CYP3A4, evaluated prospectively, and compared to the variability of the experimental assay. Different modeling strategies were investigated to assess their influence on the model's performance. Through multitask learning, additional data sets were leveraged for model building, coming from public databases, in-house CYP-related assays, or other pharmaceutical companies (federated learning). Apart from the numerical prediction of inactivation rates of CYP3A4 TDI, three-class predictions were carried out, giving a negative (inactivation rate < 0.01 min⁻¹), weak positive (0.01 min⁻¹ ≤ inactivation rate ≤ 0.025 min⁻¹), or positive (inactivation rate > 0.025 min⁻¹) output. The final multitask graph neural network model achieved misclassification rates of 8% and 7% for positive and negative TDI, respectively. Importantly, the presented deep learning-based predictions had a precision similar to the reproducibility of the experiments and thus offered great opportunities for drug design, early derisking of DDI potential, and selection of experiments.
To facilitate CYP inhibition modeling efforts in the public domain, the developed model was used to annotate ∼16 000 publicly available structures, and a surrogate data set is shared as Supporting Information.
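
The three-class scheme described in the abstract can be illustrated with a short sketch. This is not the authors' code: the function name `classify_tdi`, the constant names, and the assumption that the input is an inactivation rate expressed in min⁻¹ are all hypothetical choices made for illustration. Only the two cutoffs (0.01 and 0.025) and the inclusive boundaries of the weak positive class come from the abstract.

```python
import math

# Class boundaries taken from the abstract, assumed to be in min^-1.
NEGATIVE_UPPER = 0.01        # rates strictly below this are "negative"
WEAK_POSITIVE_UPPER = 0.025  # rates strictly above this are "positive"


def classify_tdi(rate_per_min: float) -> str:
    """Map an inactivation rate (assumed unit: min^-1) to a TDI class.

    Hypothetical helper for illustration only. Both boundaries of the
    weak positive class are inclusive, as stated in the abstract.
    """
    if math.isnan(rate_per_min):
        # NaN fails every comparison, so reject it explicitly.
        raise ValueError("inactivation rate must be a number, got NaN")
    if rate_per_min < NEGATIVE_UPPER:
        return "negative"
    if rate_per_min <= WEAK_POSITIVE_UPPER:
        return "weak positive"
    return "positive"


if __name__ == "__main__":
    # Example rates are made up for demonstration, not measured data.
    for rate in (0.005, 0.01, 0.025, 0.03):
        print(f"{rate:.3f} min^-1 -> {classify_tdi(rate)}")
```

The same mapping could be applied either to measured rates or to a model's numerical predictions; which of the two the reported misclassification rates were computed from is not stated in the abstract.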

Authors

  • Andrin Fluetsch
    Novartis Biomedical Research, Novartis Campus, Basel 4002, Switzerland.
  • Markus Trunzer
    Novartis Biomedical Research, Novartis Campus, CH-4002 Basel, Switzerland.
  • Grégori Gerebtzoff
    Novartis Institutes for BioMedical Research, NIBR Translational Medicine, Modeling and Simulations, Novartis Pharma AG, Novartis Campus, 4056, Basel, Switzerland.
  • Raquel Rodríguez-Pérez
    Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany.