Correlated RNN Framework to Quickly Generate Molecules with Desired Properties for Energetic Materials in the Low Data Regime.

Journal: Journal of chemical information and modeling
PMID:

Abstract

Motivated by the challenge of deep learning in the low data regime and the urgent demand for intelligent design of highly energetic materials, we explore a correlated deep learning framework, consisting of three recurrent neural networks (RNNs) linked by a transfer learning strategy, to efficiently generate new energetic molecules with a high detonation velocity when very limited data are available. To avoid dependence on an external big data set, data augmentation by fragment shuffling of 303 energetic compounds is used to produce 500,000 molecules for pretraining an RNN, through which the model can learn sufficient structural knowledge. The pretrained RNN is then fine-tuned on the 303 energetic compounds to generate 7153 molecules similar to them. To screen more reliably for molecules with a high detonation velocity, SMILES enumeration augmentation coupled with the pretrained knowledge is used to build an RNN-based prediction model, whose performance score is boosted from 0.4446 to 0.9572. Performance comparable to that of a transfer learning strategy based on an existing big database (ChEMBL), in producing both energetic molecules and drug-like ones, further supports the effectiveness and generality of our strategy in the low data regime. High-precision quantum mechanics calculations further confirm that 35 new molecules present a higher detonation velocity and lower synthetic accessibility than the classic explosive RDX, along with good thermal stability. In particular, three new molecules are comparable to caged CL-20 in detonation velocity. All source code and the data set are freely available at https://github.com/wangchenghuidream/RNNMGM.
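
The fragment-shuffling augmentation mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the scaffold templates, the substituent pool, and the function name shuffle_fragments are hypothetical and chosen only for illustration. The sketch shows the general idea of recombining fragments from a small seed set to obtain a larger pool of SMILES strings for pretraining.

```python
import random

# Hypothetical scaffolds: SMILES templates in which "{}" marks a substituent slot.
SCAFFOLDS = [
    "c1({})cc({})cc({})c1",   # benzene ring with three slots
    "C1N({})CN({})CN1{}",     # hexahydro-1,3,5-triazine ring with three slots
]

# Hypothetical substituent fragments (nitro, amino, hydroxyl, methyl).
SUBSTITUENTS = ["[N+](=O)[O-]", "N", "O", "C"]


def shuffle_fragments(scaffolds, substituents, n_samples, seed=0):
    """Recombine scaffolds and substituents into distinct SMILES strings.

    Returns a sorted list of at most n_samples strings. Strings are
    deduplicated textually, so two different strings may still describe
    the same molecule (for example, symmetric substitution patterns).
    """
    rng = random.Random(seed)
    # Number of distinct strings this toy fragment set can produce.
    max_unique = sum(len(substituents) ** s.count("{}") for s in scaffolds)
    target = min(n_samples, max_unique)

    generated = set()
    while len(generated) < target:
        scaffold = rng.choice(scaffolds)
        picks = [rng.choice(substituents) for _ in range(scaffold.count("{}"))]
        generated.add(scaffold.format(*picks))
    return sorted(generated)


if __name__ == "__main__":
    for smiles in shuffle_fragments(SCAFFOLDS, SUBSTITUENTS, n_samples=5):
        print(smiles)
```

Filling all three slots of the second template with the nitro fragment gives a SMILES string for RDX. A realistic pipeline would derive fragments from the seed compounds with a cheminformatics toolkit and validate each generated string before using it for pretraining.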

Authors

  • Chuan Li
    State Key Laboratory for Molecular Virology and Genetic Engineering, National Institute for Viral Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing 102206, China.
  • Chenghui Wang
    College of Computer Science, Sichuan University, Chengdu 610064, China.
  • Ming Sun
    Department of Urology, Jiangmen Central Hospital, Affiliated Jiangmen Hospital of Sun Yat-sen University, 23 Beijie Haibang Street, Jiangmen, 529030, China.
  • Yan Zeng
    Department of Urology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.
  • Yuan Yuan
    Department of Geriatrics, Beijing Jishuitan Hospital, Capital Medical University, Beijing, China.
  • Qiaolin Gou
    College of Chemistry, Sichuan University, Chengdu 610064, China.
  • Guangchuan Wang
    College of Computer Science, Sichuan University, Chengdu 610064, China.
  • Yanzhi Guo
    College of Chemistry, Sichuan University, Chengdu 610064, PR China. Electronic address: yzguo@scu.edu.cn.
  • Xuemei Pu
    College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China. xmpuscu@scu.edu.cn, +86 028 8541 2290.