Teaching old docks new tricks with machine learning enhanced ensemble docking.

Journal: Scientific Reports
Published Date:

Abstract

Here we introduce Ensemble Optimizer (EnOpt), a machine-learning tool to improve the accuracy and interpretability of ensemble virtual screening (VS). Ensemble VS is an established method for predicting protein/small-molecule (ligand) binding. Unlike traditional VS, which focuses on a single protein conformation, ensemble VS better accounts for protein flexibility by predicting binding to multiple protein conformations. Each compound is thus associated with a spectrum of scores (one score per protein conformation) rather than a single score. To effectively rank and prioritize the molecules for further evaluation (including experimental testing), researchers must select which protein conformations to consider and decide how best to map each compound's spectrum of scores to a single value. Both decisions are system-specific. EnOpt uses machine learning to address these challenges. We perform benchmark VS to show that, for many systems, EnOpt ranking distinguishes active compounds from inactive or decoy molecules more effectively than traditional ensemble VS methods. To encourage broad adoption, we release EnOpt free of charge under the terms of the MIT license.
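
To make the score-mapping problem concrete, the following minimal sketch shows two simple ways to collapse a compound's per-conformation scores into a single value: keeping the best score, or taking the average. It is not EnOpt's code or API. The compound names, the score values, and the helper functions (`best_score`, `average_score`, `rank`) are hypothetical and chosen purely for illustration, and the sketch assumes the common docking convention that more negative scores indicate better predicted binding.

```python
from statistics import mean

# Hypothetical docking scores (assumed convention: more negative = better).
# Each compound has one score per protein conformation in the ensemble.
scores = {
    "cmpd_A": [-7.1, -9.4, -6.8],
    "cmpd_B": [-8.0, -8.2, -8.1],
    "cmpd_C": [-5.5, -6.0, -9.0],
}


def best_score(spectrum):
    """Ensemble-best: keep the most favorable (lowest) score."""
    return min(spectrum)


def average_score(spectrum):
    """Ensemble-average: mean score across all conformations."""
    return mean(spectrum)


def rank(all_scores, aggregate):
    """Order compound names from most to least favorable aggregate score."""
    return sorted(all_scores, key=lambda name: aggregate(all_scores[name]))


by_best = rank(scores, best_score)
by_average = rank(scores, average_score)

print("Ranked by best score:   ", by_best)
print("Ranked by average score:", by_average)
```

With these illustrative numbers the two aggregation rules produce different rankings, which is the kind of system-specific choice the abstract describes.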

Authors

  • Roshni Bhatt
    Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States.
  • Ann Wang
    Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, 15260, USA.
  • Jacob D Durrant
    Department of Chemistry & Biochemistry and the National Biomedical Computation Resource, University of California, San Diego, La Jolla, California 92093, United States.