Molecular Generation with Reduced Labeling through Constraint Architecture.

Journal: Journal of Chemical Information and Modeling
Published Date:

Abstract

In the past few years, a number of machine learning (ML)-based molecular generative models have been proposed for generating molecules with desirable properties, but they all require a large amount of labeled data on pharmacological and physicochemical properties. Experimental determination of these labels, especially bioactivity labels, is very expensive. In this study, we analyze how strongly various multi-property molecule generation models depend on biological activity labels and propose Frag-G/M, a fragment-based multi-constraint molecular generation framework built on a conditional transformer, recurrent neural networks (RNNs), and reinforcement learning (RL). The experimental results show that, using the same number of labels, Frag-G/M generates several times more desired molecules than the baselines. Moreover, compared with the known active compounds, the molecules generated by Frag-G/M exhibit higher scaffold diversity than those generated by the baselines, making the framework more promising for real-world drug discovery scenarios.

Authors

  • Jike Wang
    School of Computer Science, Wuhan University, Wuhan, Hubei 430072, China.
  • Yundian Zeng
    College of Control Science and Engineering, Zhejiang University, Hangzhou, Zhejiang 310027, P. R. China.
  • Huiyong Sun
    College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China.
  • Junmei Wang
    Department of Pharmaceutical Sciences, Computational Chemical Genomics Screen Center, School of Pharmacy, University of Pittsburgh, 3501 Terrace St, Pittsburgh, PA, 15213, USA; Department of Pharmaceutical Sciences, School of Pharmacy, NIDA National Center of Excellence for Computational Drug Abuse Research, University of Pittsburgh, 3501 Terrace St, Pittsburgh, PA, 15213, USA. Electronic address: junmei.wang@pitt.edu.
  • Xiaorui Wang
    Structural Biophysics Group, School of Optometry and Vision Sciences, Cardiff University, Cardiff, Wales, UK.
  • Ruofan Jin
    College of Life Science, Zhejiang University, Hangzhou, Zhejiang 310027, P. R. China.
  • Mingyang Wang
    Department of Ultrasound, Tianjin First Central Hospital, NanKai University, Tianjin, 300192, China.
  • Xujun Zhang
    Injury Prevention Research Institute, Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing, Jiangsu Province, China.
  • Dongsheng Cao
    School of Pharmaceutical Sciences, Central South University, Changsha, China. oriental-cds@163.com.
  • Xi Chen
    Department of Critical Care Medicine, Shenzhen Hospital, Southern Medical University, Guangdong, Shenzhen, China.
  • Chang-Yu Hsieh
    Tencent Quantum Laboratory, Tencent, Shenzhen 518057 Guangdong, P. R. China.
  • Tingjun Hou
    College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China.