G2GT: Retrosynthesis Prediction with Graph-to-Graph Attention Neural Network and Self-Training.

Journal: Journal of chemical information and modeling
Published Date:

Abstract

Retrosynthesis prediction, the task of identifying reactant molecules that can be used to synthesize product molecules, is a fundamental challenge in organic chemistry and related fields. To address this challenge, we propose a novel graph-to-graph transformation model, G2GT. The model is built on the standard transformer structure and utilizes graph encoders and decoders. Additionally, we demonstrate the effectiveness of self-training, a data augmentation technique that utilizes unlabeled molecular data, in improving the performance of the model. To further enhance diversity, we propose a weak ensemble method, inspired by reaction-type labels and ensemble learning. This method incorporates beam search, nucleus sampling, and top-k sampling to improve inference diversity. A simple ranking algorithm is employed to retrieve the final top-10 results. We achieved new state-of-the-art results on both the USPTO-50K data set, with a top-1 accuracy of 54%, and the larger, more challenging USPTO-Full data set, with a top-1 accuracy of 49.3% and competitive top-10 results. Our model can also be generalized to all other graph-to-graph transformation tasks. Data and code are available at https://github.com/Anonnoname/G2GT_2.
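
The abstract combines beam search, nucleus (top-p) sampling, and top-k sampling and then ranks the pooled candidates. Below is a minimal, illustrative sketch of that general idea, not the authors' implementation: the function names (filter_top_k, filter_nucleus, rank_candidates) and the frequency-then-score ranking rule are assumptions introduced here for clarity.

```python
# Illustrative sketch: combine candidates from several decoding strategies
# (beam search, nucleus sampling, top-k sampling) and rank the merged pool.
# All names and the ranking rule are hypothetical, not from the G2GT code base.
import numpy as np

def filter_top_k(probs: np.ndarray, k: int) -> np.ndarray:
    """Keep the k most probable entries, zero the rest, renormalize."""
    out = np.zeros_like(probs)
    top = np.argsort(probs)[-k:]
    out[top] = probs[top]
    return out / out.sum()

def filter_nucleus(probs: np.ndarray, p: float) -> np.ndarray:
    """Keep the smallest set of entries whose cumulative mass reaches p."""
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    keep = order[: int(np.searchsorted(cum, p)) + 1]
    out = np.zeros_like(probs)
    out[keep] = probs[keep]
    return out / out.sum()

def rank_candidates(candidate_sets, top_n=10):
    """Merge candidates from several decoding strategies.

    Each set maps candidate -> model score (e.g. log-likelihood). Candidates
    proposed by more strategies rank first; ties are broken by best score.
    """
    votes, best = {}, {}
    for cands in candidate_sets:
        for cand, score in cands.items():
            votes[cand] = votes.get(cand, 0) + 1
            best[cand] = max(best.get(cand, float("-inf")), score)
    ranked = sorted(votes, key=lambda c: (-votes[c], -best[c]))
    return ranked[:top_n]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    probs = rng.dirichlet(np.ones(20))       # dummy next-token distribution
    print(filter_top_k(probs, 5).round(3))
    print(filter_nucleus(probs, 0.9).round(3))
    # Dummy candidate sets standing in for beam / nucleus / top-k decoder output
    beam    = {"reactants_A": -1.2, "reactants_B": -2.0}
    nucleus = {"reactants_A": -1.3, "reactants_C": -2.5}
    topk    = {"reactants_B": -1.9, "reactants_D": -3.0}
    print(rank_candidates([beam, nucleus, topk]))
```

The point of pooling several decoding strategies is diversity: beam search favors high-likelihood but similar outputs, while nucleus and top-k sampling introduce alternative reactant sets that a simple ranking step can then order into a final top-10 list.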

Authors

  • Zaiyun Lin
    Stone Wise, Room 918, Eighth Floor, Building 1, No. 6 Danling Street, Haidian District, Beijing, China 100089.
  • Shiqiu Yin
    Stonewise, No. 19 Zhongguancun Street, Haidian District, 100080 Beijing, P. R. China.
  • Lei Shi
  • Wenbiao Zhou
    Beijing StoneWise Technology Co Ltd., Haidian Street #15, Haidian District, Beijing 100080, China.
  • Yingsheng John Zhang
    Stone Wise, Room 918, Eighth Floor, Building 1, No. 6 Danling Street, Haidian District, Beijing, China 100089.