Categorical diffusion ensembles for single-step chemical retrosynthesis.

Journal: Journal of Cheminformatics
Published Date:

Abstract

Methods for automatic chemical retrosynthesis have found recent success through the application of models traditionally built for natural language processing, primarily transformer neural networks. These models have demonstrated significant ability to translate between the SMILES encodings of chemical products and reactants, but are constrained by their autoregressive nature. We propose an alternative template-free method for single-step retrosynthesis prediction in the form of categorical diffusion, which allows the entire output SMILES sequence to be predicted in unison. We construct an ensemble of diffusion models which achieves state-of-the-art performance for top-1 accuracy and competitive performance for top-3, top-5, and top-10 accuracy among template-free methods. We show that the proposed method is a strong baseline for a new class of template-free model and is capable of learning a variety of synthetic techniques used in laboratory settings.
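
The top-k accuracy figures mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the ensemble's sampled reactant strings are pooled and ranked by how often each one appears; the abstract does not state how the authors aggregate predictions, so this ranking rule is an assumption. The function names and the toy SMILES strings are hypothetical, and the comparison is plain string equality rather than canonicalized SMILES matching.

```python
from collections import Counter


def rank_ensemble_predictions(samples):
    """Rank candidate reactant strings by how often they were sampled.

    Assumption: frequency-based pooling stands in for whatever
    aggregation the ensemble actually uses.
    """
    counts = Counter(samples)
    # most_common() orders by count, highest first; ties keep first-seen order.
    return [smiles for smiles, _ in counts.most_common()]


def top_k_accuracy(ranked_predictions, targets, k):
    """Fraction of targets found among the first k ranked candidates."""
    hits = sum(
        1
        for ranked, target in zip(ranked_predictions, targets)
        if target in ranked[:k]
    )
    return hits / len(targets)


# Hypothetical toy data: three sampled reactant sets for each of two products.
samples_per_product = [
    ["CCO.CC(=O)O", "CCO.CC(=O)O", "CCO.CC(=O)Cl"],
    ["c1ccccc1Br", "c1ccccc1I", "c1ccccc1I"],
]
targets = ["CCO.CC(=O)O", "c1ccccc1Br"]

ranked = [rank_ensemble_predictions(s) for s in samples_per_product]
top1 = top_k_accuracy(ranked, targets, k=1)
top3 = top_k_accuracy(ranked, targets, k=3)
print(f"top-1: {top1:.2f}, top-3: {top3:.2f}")
```

In this toy case the second target is ranked second, so it counts toward top-3 but not top-1. A real evaluation would normally canonicalize both predictions and targets before comparing them.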

Authors

  • Sean Current
    Department of Computer Science and Engineering, The Ohio State University.
  • Ziqi Chen
    Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science & Technology, Shanghai, 200237, China.
  • Daniel Adu-Ampratwum
    Division of Medicinal Chemistry and Pharmacognosy, College of Pharmacy, The Ohio State University, Columbus, Ohio 43210, United States.
  • Xia Ning
    Department of Biomedical Informatics, the Department of Computer Science and Engineering, and the Translational Data Analytics Institute, The Ohio State University, Columbus, OH, 43210.
  • Srinivasan Parthasarathy
    Department of Biomedical Informatics, the Department of Computer Science and Engineering, and the Translational Data Analytics Institute, The Ohio State University, Columbus, OH, 43210.
