Active Learning FEP: Impact on Performance of AL Protocol and Chemical Diversity.
Journal:
Journal of Chemical Theory and Computation
Published Date:
May 13, 2025
Abstract
Active learning using models built on binding potency predictions from free energy perturbation (AL-FEP) has been proposed as a method for generating machine learning models capable of predicting biochemical potency for early-stage lead optimization where limited measured data are available. Two applications of AL-FEP are described here for different bromodomain inhibitor series that were developed in historic GSK projects: one where the core is kept constant and the other where core changes are included in the pool of compound ideas. Measured biochemical potency data have been used to assess the performance of the final models and demonstrate that well-performing models can be generated within several rounds of active learning, especially when the core is kept constant. To apply this method routinely to drug discovery projects, a retrospective evaluation of the AL-FEP workflow has been conducted covering parameters including the compound selection strategy, explore-exploit ratios, and number of compounds selected per cycle. Significant differences in performance, including in model enrichment, are observed and rationalized. Recommendations are made as to when specific parameters should be employed for AL-FEP depending on the context (maximizing potency or broad-range prediction accuracy) in which the final model is to be deployed.
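The workflow parameters named in the abstract (selection strategy, explore-exploit ratio, compounds per cycle) can be illustrated with a minimal sketch of one selection step. This is not the authors' protocol: the function name `select_batch`, the use of a per-compound uncertainty score for the explore share, and the convention that higher predicted potency is better (for example a pIC50-like value) are all assumptions made for illustration only.

```python
def select_batch(predicted, uncertainty, batch_size, explore_fraction):
    """Pick compound IDs for the next FEP round (hypothetical helper).

    predicted:        dict of compound ID -> model-predicted potency
                      (assumed: higher is better)
    uncertainty:      dict of compound ID -> model uncertainty estimate
                      (assumed to be available from the ML model)
    batch_size:       number of compounds selected per cycle
    explore_fraction: share of the batch spent on exploration, in [0, 1]
    """
    if not 0.0 <= explore_fraction <= 1.0:
        raise ValueError("explore_fraction must lie in [0, 1]")

    batch_size = min(batch_size, len(predicted))
    n_explore = int(round(batch_size * explore_fraction))
    n_exploit = batch_size - n_explore

    # Exploit: take the compounds the model currently predicts to be most potent.
    by_potency = sorted(predicted, key=lambda c: predicted[c], reverse=True)
    exploit = by_potency[:n_exploit]

    # Explore: from what is left, take the compounds the model is least sure about.
    chosen = set(exploit)
    remaining = [c for c in predicted if c not in chosen]
    by_uncertainty = sorted(remaining, key=lambda c: uncertainty[c], reverse=True)
    explore = by_uncertainty[:n_explore]

    return exploit + explore


# Toy pool of five compound ideas with made-up scores.
predicted = {"a": 7.0, "b": 6.5, "c": 5.0, "d": 8.1, "e": 6.0}
uncertainty = {"a": 0.1, "b": 0.9, "c": 0.4, "d": 0.2, "e": 0.7}
batch = select_batch(predicted, uncertainty, batch_size=4, explore_fraction=0.5)
print(batch)
```

In a full loop, the selected compounds would be scored with FEP, added to the training set, and the model retrained before the next cycle. Setting `explore_fraction` to 0 gives a purely greedy strategy and setting it to 1 gives a purely uncertainty-driven one.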