Valid, Plausible, and Diverse Retrosynthesis Using Tied Two-Way Transformers with Latent Variables.

Journal: Journal of Chemical Information and Modeling
Published Date:

Abstract

Retrosynthesis is an essential task in organic chemistry for identifying the synthesis pathways of newly discovered materials. With recent advances in deep learning, there have been growing attempts to solve the retrosynthesis problem by casting it as a machine translation problem and applying transformer models, which are the state of the art in neural machine translation. However, the pure transformer provides unsatisfactory results: its reactant candidates lack grammatical validity, chemical plausibility, and diversity. In this study, we develop tied two-way transformers with latent modeling that address these problems using cycle consistency checks, parameter sharing, and multinomial latent variables. Experimental results on public and in-house datasets demonstrate that the proposed model improves retrosynthesis accuracy, reduces grammatical errors, and increases the diversity of reactant candidates, and qualitative evaluation verifies its ability to suggest valid and plausible results.
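
The idea behind a cycle consistency check can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `cycle_consistent`, the lookup table `TOY_FORWARD`, and the candidate scores are all hypothetical, and a dictionary stands in for a trained forward (reactants to product) model. The sketch also assumes every SMILES string is already in one canonical form, so that plain string equality is a meaningful comparison. A candidate reactant set is kept only if the forward model maps it back to the target product.

```python
from typing import Callable, List, Tuple

Candidate = Tuple[str, float]  # (reactant SMILES, model score)


def cycle_consistent(
    product: str,
    candidates: List[Candidate],
    forward: Callable[[str], str],
) -> List[Candidate]:
    """Keep candidates whose forward prediction reproduces the product.

    Survivors are returned in descending order of score.
    """
    kept = [(r, s) for r, s in candidates if forward(r) == product]
    return sorted(kept, key=lambda c: c[1], reverse=True)


# Hypothetical stand-in for a trained forward model: a fixed lookup table.
TOY_FORWARD = {
    "CC(=O)O.OCC": "CC(=O)OCC",   # acetic acid + ethanol -> ethyl acetate
    "CC(=O)Cl.OCC": "CC(=O)OCC",  # acetyl chloride + ethanol -> ethyl acetate
    "CC(=O)O.OC": "CC(=O)OC",     # acetic acid + methanol -> methyl acetate
}


def toy_forward(reactants: str) -> str:
    # Unknown or malformed inputs map to an empty string, so they never match.
    return TOY_FORWARD.get(reactants, "")


# Illustrative candidates, as a retrosynthesis model might propose them.
candidates = [
    ("CC(=O)O.OC", 0.5),     # plausible chemistry, but wrong product
    ("CC(=O)Cl.OCC", 0.4),
    ("CC(=O)O.OCC", 0.3),
    ("CC(=O)O.OCC(", 0.2),   # grammatically invalid SMILES
]

filtered = cycle_consistent("CC(=O)OCC", candidates, toy_forward)
print(filtered)
```

In this toy setting the top-scored candidate is discarded because it leads to a different product, and the malformed string is discarded because the stand-in forward model cannot map it to anything. In practice the comparison would need canonicalized SMILES on both sides rather than raw strings.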

Authors

  • Eunji Kim
    Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon 16678, Republic of Korea.
  • Dongseon Lee
    Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon 16678, Republic of Korea.
  • Youngchun Kwon
    Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon 16678, Republic of Korea.
  • Min Sik Park
    Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon 16678, Republic of Korea.
  • Youn-Suk Choi
    Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon 16678, Republic of Korea.