Machine learning and ligand binding predictions: A review of data, methods, and obstacles.

Journal: Biochimica et biophysica acta. General subjects

Published Date: Feb 10, 2020

Abstract

Computational prediction of ligand binding is a difficult problem, and the more accurate methods are extremely computationally expensive. Machine learning for drug binding prediction could leverage biomedical big data in place of time-intensive simulations. This paper reviews current trends in the use of machine learning for drug binding predictions, data sources for developing machine learning algorithms, and potential problems that may lead to overfitting and ungeneralizable models. A few popular datasets that can be used to develop virtual high-throughput screening models are characterized using spatial statistics to quantify potential biases. Evaluating some common benchmarks shows that good performance correlates with high predicted bias scores, while models with low bias scores have little predictive power. A better understanding of the limits of available data sources and how to fix them will lead to more generalizable models and, in turn, to novel drug discovery.

Authors

Sally R Ellingson

College of Medicine, Division of Biomedical Informatics, University of Kentucky, Lexington, KY, United States of America; Markey Cancer Center, Lexington, KY, United States of America. Electronic address: sally@kcr.uky.edu.

Brian Davis

Markey Cancer Center, Lexington, KY, United States of America.

Jonathan Allen

Lawrence Livermore National Laboratory, Livermore, CA, United States of America.

Keywords

Algorithms; Big Data; Computational Biology; Drug Discovery; Humans; Ligands; Machine Learning; Protein Binding

External Resources

PubMed ID (PMID): 32057823
