Novel Consensus Architecture To Improve Performance of Large-Scale Multitask Deep Learning QSAR Models.

Journal: Journal of chemical information and modeling
Published Date:

Abstract

Advances in high-throughput screening and automated chemistry have rapidly accelerated the production of chemical and biological data, much of it freely accessible through literature aggregator services such as ChEMBL and PubChem. Here, we explore how to use this comprehensive mapping of chemical biology space to support the development of large-scale quantitative structure-activity relationship (QSAR) models. We propose a new deep learning consensus architecture (DLCA) that combines consensus and multitask deep learning approaches to generate large-scale QSAR models. This method improves knowledge transfer across different targets/assays while also integrating contributions from models based on different descriptors. The proposed approach was validated and compared with proteochemometrics, multitask deep learning, and Random Forest methods paired with various descriptor types. DLCA models demonstrated improved prediction accuracy for both regression and classification tasks. The best models, together with their modeling sets, are provided through publicly available web services at https://predictor.ncats.io.
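
The abstract does not give layer-level details of DLCA, so the following is only an illustrative sketch of the general idea: one multitask branch per descriptor type, with a consensus step that merges the per-task outputs of all branches. The class and function names, the descriptor names ("fingerprint", "physchem"), the layer sizes, and the use of a plain mean as the consensus rule are all assumptions made for illustration, not the authors' implementation. The weights are random and untrained, so the numbers themselves carry no meaning.

```python
import math
import random


def relu_layer(x, weights, biases):
    """Fully connected layer followed by ReLU; one output per weight row."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def linear_layer(x, weights, biases):
    """Fully connected layer without activation (regression-style outputs)."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def init_layer(rng, n_in, n_out):
    """Random weights scaled by 1/sqrt(n_in) and zero biases (untrained)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultitaskBranch:
    """Hypothetical branch for one descriptor type.

    A hidden layer shared by all tasks feeds one linear output per task,
    which is the basic multitask pattern.
    """

    def __init__(self, rng, n_features, n_hidden, n_tasks):
        self.hidden = init_layer(rng, n_features, n_hidden)
        self.heads = init_layer(rng, n_hidden, n_tasks)

    def predict(self, descriptor):
        shared = relu_layer(descriptor, *self.hidden)
        return linear_layer(shared, *self.heads)


def consensus_predict(branches, descriptors):
    """Assumed consensus rule: average each task's output over all branches."""
    per_branch = [branches[name].predict(descriptors[name]) for name in branches]
    return [sum(values) / len(per_branch) for values in zip(*per_branch)]


rng = random.Random(0)
n_tasks = 3  # e.g. three targets/assays modeled jointly

branches = {
    "fingerprint": MultitaskBranch(rng, n_features=8, n_hidden=4, n_tasks=n_tasks),
    "physchem": MultitaskBranch(rng, n_features=5, n_hidden=4, n_tasks=n_tasks),
}

# One compound described by two made-up descriptor vectors.
compound = {
    "fingerprint": [1, 0, 0, 1, 1, 0, 1, 0],
    "physchem": [0.3, 1.2, -0.5, 0.0, 2.1],
}

predictions = consensus_predict(branches, compound)
print(predictions)  # one consensus value per task
```

In a trained model the branches and the consensus step would typically be fitted on the modeling sets; a learned combination could replace the simple mean used here.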

Authors

  • Alexey V Zakharov
    National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, 9800 Medical Center Drive, Rockville, Maryland 20850, United States.
  • Tongan Zhao
    National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, 9800 Medical Center Drive, Rockville, Maryland 20850, United States.
  • Dac-Trung Nguyen
    National Center for Advancing Translational Science, Rockville, MD, USA.
  • Tyler Peryea
    National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, 9800 Medical Center Drive, Rockville, Maryland 20850, United States.
  • Timothy Sheils
    National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, 9800 Medical Center Drive, Rockville, Maryland 20850, United States.
  • Adam Yasgar
    National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, 9800 Medical Center Drive, Rockville, Maryland 20850, United States.
  • Ruili Huang
    National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, 9800 Medical Center Drive, Rockville, Maryland 20850, United States.
  • Noel Southall
    National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, 9800 Medical Center Drive, Rockville, Maryland 20850, United States.
  • Anton Simeonov
    National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD, USA.