HelixADMET: a robust and endpoint extensible ADMET system incorporating self-supervised knowledge transfer.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Accurate ADMET (absorption, distribution, metabolism, excretion and toxicity) predictions can efficiently screen out undesirable drug candidates in the early stages of drug discovery. In recent years, multiple comprehensive ADMET systems that adopt advanced machine learning models have been developed, providing services to estimate multiple endpoints. However, these systems usually suffer from weak extrapolation ability. First, because labelled data for each endpoint are scarce, typical machine learning models perform poorly on molecules with unobserved scaffolds. Second, most systems provide only fixed built-in endpoints and cannot be customized to satisfy various research requirements. To address both problems, we develop a robust and endpoint-extensible ADMET system, HelixADMET (H-ADMET). H-ADMET incorporates self-supervised learning to produce a robust pre-trained model. The model is then fine-tuned with a multi-task, multi-stage framework to transfer knowledge between ADMET endpoints, auxiliary tasks and self-supervised tasks.
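
As an illustration of one common ingredient of multi-task training on sparsely labelled endpoints, the sketch below computes a masked mean squared error that skips (molecule, endpoint) pairs with no measurement. The abstract does not describe H-ADMET's loss, so this is an assumption rather than the authors' method, and the function name `masked_multitask_mse` is hypothetical.

```python
from typing import Optional, Sequence


def masked_multitask_mse(
    predictions: Sequence[Sequence[float]],
    labels: Sequence[Sequence[Optional[float]]],
) -> float:
    """Mean squared error over observed (molecule, endpoint) pairs only.

    Rows are molecules and columns are endpoints. A label of None means
    the endpoint was not measured for that molecule (illustrative
    convention, not taken from the H-ADMET paper).
    """
    total = 0.0
    observed = 0
    for pred_row, label_row in zip(predictions, labels):
        for pred, label in zip(pred_row, label_row):
            if label is None:  # skip unmeasured endpoints
                continue
            total += (pred - label) ** 2
            observed += 1
    if observed == 0:
        raise ValueError("no observed labels in batch")
    return total / observed
```

Averaging only over observed pairs lets molecules labelled for a single endpoint still contribute to a shared model, which is one simple way that endpoints with few labels can benefit from the others.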

Authors

  • Shanzhuo Zhang
    Department of Natural Language Processing, Baidu International Technology (Shenzhen) Co., Ltd, Shenzhen 518000, China.
  • Zhiyuan Yan
    State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin 150001, China. yanzhiyuan@hit.edu.cn.
  • Yueyang Huang
    Department of Natural Language Processing, Baidu International Technology (Shenzhen) Co., Ltd, Shenzhen 518000, China.
  • Lihang Liu
    Department of Natural Language Processing, Baidu International Technology (Shenzhen) Co., Ltd, Shenzhen 518000, China.
  • Donglong He
    Department of Natural Language Processing, Baidu International Technology (Shenzhen) Co., Ltd, Shenzhen 518000, China.
  • Wei Wang
    State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences, University of Macau, Macau 999078, China.
  • Xiaomin Fang
    Department of Natural Language Processing, Baidu International Technology (Shenzhen) Co., Ltd, Shenzhen 518000, China.
  • Xiaonan Zhang
    Department of Natural Language Processing, Baidu International Technology (Shenzhen) Co., Ltd, Shenzhen 518000, China.
  • Fan Wang
    Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China.
  • Hua Wu
    Research and Development Center of Biorational Pesticide, Northwest Agriculture & Forestry University, Yangling, Shaanxi 712100, PR China. Electronic address: huawu686@aliyun.com.
  • Haifeng Wang
    Collaborative Innovation Center of Seafood Deep Processing, Institute of Seafood, Zhejiang Gongshang University, Hangzhou, 310012, China.