Reproducing the invention of a named reaction: zero-shot prediction of unseen chemical reactions.

Journal: Physical chemistry chemical physics : PCCP
Published Date:

Abstract

While state-of-the-art models can predict reactions through transfer learning on thousands of samples of the same reaction types as the reactions to be predicted, how to prepare such models to predict "unseen" reactions remains an unanswered question. We aimed to study the Transformer model's ability to predict "unseen" reactions through "zero-shot reaction prediction (ZSRP)", a concept derived from zero-shot learning and zero-shot translation. We reproduced the human invention of the Chan-Lam coupling reaction, whose inventor was inspired by the Suzuki reaction when improving Barton's bismuth arylation reaction. After being fine-tuned with samples from these two "existing" reactions, the USPTO-trained Transformer could predict "unseen" Chan-Lam coupling reactions with 55.7% top-1 accuracy. Our model could also mimic the later stage of the history of this reaction, where the initial case of this reaction was generalized to more reactants and reagents, through "one-shot/few-shot reaction prediction (OSRP/FSRP)" approaches.
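
The top-1 accuracy quoted above is, in the usual sense, the fraction of test reactions whose highest-ranked predicted product matches the recorded product. The following is a minimal sketch of that metric, not the authors' evaluation code. The function name `top_k_accuracy` and the toy SMILES strings are hypothetical, and the exact string comparison assumes that predictions and targets have already been canonicalized in the same way.

```python
def top_k_accuracy(predictions, targets, k=1):
    """Fraction of reactions whose true product is among the first k predictions.

    predictions: list of ranked candidate lists (best candidate first), one per reaction.
    targets: list of true product strings, one per reaction.
    Assumption: all strings share one canonical SMILES form, so exact match is valid.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("at least one reaction is required")
    hits = sum(
        1 for ranked, truth in zip(predictions, targets) if truth in ranked[:k]
    )
    return hits / len(targets)


# Hypothetical toy data: two test reactions, two ranked candidates each.
predictions = [
    ["c1ccc(Nc2ccccc2)cc1", "c1ccc(-c2ccccc2)cc1"],  # correct product ranked first
    ["c1ccc(-c2ccccc2)cc1", "c1ccc(Oc2ccccc2)cc1"],  # correct product ranked second
]
targets = ["c1ccc(Nc2ccccc2)cc1", "c1ccc(Oc2ccccc2)cc1"]

top1 = top_k_accuracy(predictions, targets, k=1)
top2 = top_k_accuracy(predictions, targets, k=2)
print(f"top-1: {top1:.1%}, top-2: {top2:.1%}")
```

In a zero-shot setting the test set would contain only reactions of the "unseen" type, so the score reflects generalization rather than recall of fine-tuning examples.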

Authors

  • An Su
    Department of Materials Design and Innovation, University at Buffalo, Buffalo, New York 14260-1660, United States.
  • Xinqiao Wang
    Artificial Intelligence Aided Drug Discovery Institute, College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, P. R. China. hduan@zjut.edu.cn.
  • Ling Wang
    The State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, #7 Jinsui Road, Guangzhou, Guangdong 510230, China.
  • Chengyun Zhang
    Artificial Intelligent Aided Drug Discovery Lab, College of Pharmaceutical Science, Zhejiang University of Technology, Hangzhou 310014, China.
  • Yejian Wu
    Artificial Intelligence Aided Drug Discovery Institute, College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, China. hduan@zjut.edu.cn.
  • Xinyi Wu
    Department of Immunology, School of Basic Medical Sciences, Anhui Medical University, Hefei, 230032, PR China.
  • Qingjie Zhao
    CAS Key Laboratory for Receptor Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China.
  • Hongliang Duan
    Artificial Intelligent Aided Drug Discovery Lab, College of Pharmaceutical Science, Zhejiang University of Technology, Hangzhou 310014, China.