Active Learning for Drug Design: A Case Study on the Plasma Exposure of Orally Administered Drugs.

Journal: Journal of medicinal chemistry
PMID:

Abstract

The success of artificial intelligence (AI) models has been limited by their need for large amounts of high-quality training data, which most drug discovery pipelines cannot supply. Active learning (AL) is a subfield of AI concerned with algorithms that select the data they need to improve their models. Here, we propose a two-phase AL pipeline and apply it to predicting the oral plasma exposure of drugs. In phase I, the AL-based model proved able to sample informative data from a noisy data set: using only 30% of the training data, it reached an accuracy of 0.856 on an independent test set. In phase II, the AL-based model explored a large, diverse chemical space (855K samples) for experimental testing and feedback. Improved accuracy and new high-confidence predictions (50K samples) were observed, suggesting that the model's applicability domain has been significantly expanded.
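
The idea of a model that selects its own training data can be illustrated with a minimal sketch of pool-based uncertainty sampling. This is not the authors' two-phase pipeline, whose model, descriptors, and query strategy are not specified in the abstract. Everything in the sketch is an assumption made for illustration: the one-dimensional toy pool, the noise-free `oracle` standing in for an experimental measurement, the simple cutoff classifier, and the helper names `fit_threshold`, `uncertainty`, and `active_learning`.

```python
def fit_threshold(xs, ys):
    """Fit a 1-D cutoff midway between the largest labeled negative
    and the smallest labeled positive (toy stand-in for a real model)."""
    neg = [x for x, y in zip(xs, ys) if y == 0]
    pos = [x for x, y in zip(xs, ys) if y == 1]
    if not neg or not pos:
        raise ValueError("labeled set must contain both classes")
    return (max(neg) + min(pos)) / 2.0


def predict(threshold, x):
    """Classify a point as 1 if it lies above the cutoff, else 0."""
    return 1 if x > threshold else 0


def uncertainty(threshold, x):
    """Score a point; points closer to the cutoff get a larger score."""
    return -abs(x - threshold)


def active_learning(pool, oracle, seed_idx, n_rounds, batch_size):
    """Pool-based loop: train, query the most uncertain points, retrain."""
    labels = {i: oracle(pool[i]) for i in seed_idx}
    unlabeled = [i for i in range(len(pool)) if i not in labels]
    threshold = fit_threshold([pool[i] for i in labels],
                              [labels[i] for i in labels])
    for _ in range(n_rounds):
        if not unlabeled:
            break
        # Rank the unlabeled pool so the most uncertain points come first.
        unlabeled.sort(key=lambda i: uncertainty(threshold, pool[i]),
                       reverse=True)
        batch, unlabeled = unlabeled[:batch_size], unlabeled[batch_size:]
        for i in batch:
            labels[i] = oracle(pool[i])  # hypothetical "experiment"
        threshold = fit_threshold([pool[i] for i in labels],
                                  [labels[i] for i in labels])
    return threshold, labels


# Hypothetical toy problem: 200 evenly spaced points, true cutoff at 0.5.
pool = [-3.0 + 6.0 * i / 199 for i in range(200)]
oracle = lambda x: 1 if x > 0.5 else 0

threshold, labels = active_learning(pool, oracle, seed_idx=[0, 199],
                                    n_rounds=5, batch_size=5)
accuracy = sum(predict(threshold, x) == oracle(x) for x in pool) / len(pool)
print(len(labels), round(threshold, 3), round(accuracy, 3))
```

In this toy setting the loop labels 27 of the 200 pool points (2 seeds plus 5 rounds of 5 queries), and each round concentrates its queries around the current decision boundary. A real application would replace the cutoff classifier with a trained predictive model and would have to cope with label noise, which this sketch ignores.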

Authors

  • Xiaoyu Ding
    Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.
  • Rongrong Cui
    School of Chinese Materia Medica, Nanjing University of Chinese Medicine, 138 Xianlin Road, Nanjing 210023, China.
  • Jie Yu
    Institute of Animal Nutrition, Sichuan Agricultural University, Key Laboratory for Animal Disease-Resistance Nutrition of China Ministry of Education, Key Laboratory of Animal Disease-resistant Nutrition and Feed of China Ministry of Agriculture and Rural Affairs, Key Laboratory of Animal Disease-resistant Nutrition of Sichuan Province, Ya'an, 625014, China.
  • Tiantian Liu
    Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.
  • Tingfei Zhu
    Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.
  • Dingyan Wang
    Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.
  • Jie Chang
    School of Chinese Materia Medica, Nanjing University of Chinese Medicine, 138 Xianlin Road, Nanjing 210023, China.
  • Zisheng Fan
    School of Chinese Materia Medica, Nanjing University of Chinese Medicine, 138 Xianlin Road, Nanjing 210023, China.
  • Xiaomeng Liu
    Materials Science and Engineering Program, The University of Texas at Austin, Austin, TX 78712.
  • Kaixian Chen
    Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.
  • Hualiang Jiang
    Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China; School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China.
  • Xutong Li
    Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China.
  • Xiaomin Luo
    Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.
  • Mingyue Zheng
    School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, Zhejiang Province, China.