Deep learning driven biosynthetic pathways navigation for natural products with BioNavi-NP.

Journal: Nature communications
PMID:

Abstract

The complete biosynthetic pathways of most natural products (NPs) are unknown, so computer-aided bio-retrosynthesis prediction is valuable. Here, BioNavi-NP, a navigable and user-friendly toolkit, is developed to predict biosynthetic pathways for both NPs and NP-like compounds. First, a single-step bio-retrosynthesis prediction model is trained on both general organic and biosynthetic reactions using end-to-end transformer neural networks. Based on this model, plausible biosynthetic pathways are efficiently sampled from iterative multi-step bio-retrosynthetic routes by an AND-OR tree-based planning algorithm. Extensive evaluations show that BioNavi-NP identifies biosynthetic pathways for 90.2% of 368 test compounds and recovers the reported building blocks for 72.8% of them, making it 1.7 times more accurate than conventional rule-based approaches. The model is further shown to identify biologically plausible pathways for complex NPs collected from the recent literature. The toolkit, the curated datasets, and the learned models are freely available to facilitate the elucidation and reconstruction of biosynthetic pathways for NPs.
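The AND-OR tree idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the BioNavi-NP implementation: the abstract describes a guided planning algorithm on top of a transformer model, whereas the sketch below swaps in a plain depth-limited exhaustive search and a hard-coded lookup table. The names `TOY_MODEL`, `BUILDING_BLOCKS`, and `solve`, the molecule labels, and all probabilities are hypothetical and chosen only for illustration. A molecule is treated as an OR node (any one candidate reaction can solve it) and a reaction as an AND node (all of its precursors must be solved).

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for a learned single-step model (assumption):
# product -> list of (probability, precursors) candidates.
TOY_MODEL: Dict[str, List[Tuple[float, List[str]]]] = {
    "target": [(0.6, ["inter_a", "block_1"]), (0.3, ["inter_b"])],
    "inter_a": [(0.9, ["block_2"])],
    "inter_b": [(0.8, ["unknown_x"])],
}
# Hypothetical set of accepted building blocks (assumption).
BUILDING_BLOCKS = {"block_1", "block_2"}

# A route is a list of (product, precursors) steps.
Route = List[Tuple[str, List[str]]]


def solve(mol: str, depth: int) -> Optional[Tuple[float, Route]]:
    """OR node: a molecule is solved if any one candidate reaction is solved.

    Returns the best (score, route) found within `depth` steps, or None.
    """
    if mol in BUILDING_BLOCKS:
        return 1.0, []
    if depth == 0:
        return None
    best: Optional[Tuple[float, Route]] = None
    for prob, precursors in TOY_MODEL.get(mol, []):
        # AND node: every precursor of this reaction must be solved.
        score: float = prob
        route: Route = [(mol, precursors)]
        for precursor in precursors:
            sub = solve(precursor, depth - 1)
            if sub is None:
                break  # one unsolved precursor invalidates the reaction
            score *= sub[0]
            route += sub[1]
        else:
            # Reached only if no precursor failed; keep the best-scoring route.
            if best is None or score > best[0]:
                best = (score, route)
    return best


result = solve("target", depth=3)
if result is not None:
    score, route = result
    print(f"route score: {score:.2f}")
    for product, precursors in route:
        print(f"{product} <= {' + '.join(precursors)}")
```

In this toy setup the second candidate for `target` is rejected because `unknown_x` is neither a building block nor expandable, so only the route through `inter_a` survives. Scoring a route as the product of step probabilities is likewise an assumption of this sketch, not a detail stated in the abstract.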

Authors

  • Shuangjia Zheng
    Research Center for Drug Discovery, School of Pharmaceutical Sciences, Sun Yat-sen University, 132 East Circle at University City, Guangzhou 510006, China.
  • Tao Zeng
    Department of Urology, Second Affiliated Hospital of Nanchang University, Nanchang, China.
  • Chengtao Li
    School of Environmental Science and Engineering, Shaanxi University of Science and Technology, Xi'an 710021, China.
  • Binghong Chen
    College of Computing, Georgia Institute of Technology, Atlanta, GA, USA.
  • Connor W Coley
    Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA. whgreen@mit.edu kfjensen@mit.edu.
  • Yuedong Yang
    Institute for Glycomics and School of Information and Communication Technique, Griffith University, Parklands Dr. Southport, QLD 4222, Australia.
  • Ruibo Wu
    School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510006, China. wurb3@sysu.edu.cn.