Protein Family-Specific Models Using Deep Neural Networks and Transfer Learning Improve Virtual Screening and Highlight the Need for More Data.

Journal: Journal of chemical information and modeling

Published Date: Oct 16, 2018

Abstract

Machine learning has shown enormous potential for computer-aided drug discovery. Here we show how modern convolutional neural networks (CNNs) can be applied to structure-based virtual screening. We have coupled our densely connected CNN (DenseNet) with a transfer learning approach which we use to produce an ensemble of protein family-specific models. We conduct an in-depth empirical study and provide the first guidelines on the minimum requirements for adopting a protein family-specific model. Our method also highlights the need for additional data, even in data-rich protein families. Our approach outperforms recent benchmarks on the DUD-E data set and an independent test set constructed from the ChEMBL database. Using a clustered cross-validation on DUD-E, we achieve an average AUC ROC of 0.92 and a 0.5% ROC enrichment factor of 79. This represents an improvement in early enrichment of over 75% compared to a recent machine learning benchmark. Our results demonstrate that the continued improvements in machine learning architecture for computer vision apply to structure-based virtual screening.

Authors

Fergus Imrie

Oxford Protein Informatics Group, Department of Statistics , University of Oxford , Oxford OX1 3LB , U.K.
Anthony R Bradley

Structural Genomics Consortium , University of Oxford , Oxford OX3 7DQ , U.K.
Mihaela van der Schaar

University of California, Los Angeles, CA, USA.
Charlotte M Deane

Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford, United Kingdom.

Keywords

Computer-Aided Design Databases, Pharmaceutical Databases, Protein Drug Discovery Humans Ligands Machine Learning Molecular Docking Simulation Neural Networks, Computer Proteins

External Resources

View on PubMed Access via DOI PubMed (30273487)

Protein Family-Specific Models Using Deep Neural Networks and Transfer Learning Improve Virtual Screening and Highlight the Need for More Data.

Abstract

Authors

Keywords

External Resources

Popular Topics

Recent Journals