Deep learning in template-free de novo biosynthetic pathway design of natural products.

Journal: Briefings in bioinformatics
PMID:

Abstract

Natural products (NPs) are indispensable in drug development, particularly in combating infections, cancer, and neurodegenerative diseases. However, their limited availability poses significant challenges. Template-free de novo biosynthetic pathway design provides a strategic solution for NP production, with deep learning standing out as a powerful tool in this domain. This review surveys state-of-the-art deep learning algorithms for NP biosynthetic pathway design. It discusses in depth databases such as the Kyoto Encyclopedia of Genes and Genomes (KEGG), Reactome, and UniProt, which are essential for model training, along with chemical databases such as Reaxys, SciFinder, and PubChem, which support transfer learning to expand models' understanding of the broader chemical space. It evaluates the potential and challenges of sequence-to-sequence and graph-to-graph translation models for accurate single-step prediction. Additionally, it discusses search algorithms for multistep prediction and deep learning algorithms for predicting enzyme function. The review also highlights the pivotal role of deep learning in improving catalytic efficiency through enzyme engineering, which is essential for enhancing NP production. Moreover, it examines the application of large language models in pathway design, enzyme discovery, and enzyme engineering. Finally, it addresses the challenges and prospects associated with template-free approaches, offering insights into potential advancements in NP biosynthetic pathway design.

Authors

  • Xueying Xie
    State Key Laboratory of Bioelectronics, School of Biological Sciences and Medical Engineering, Southeast University, Nanjing 210096, China.
  • Lin Gui
    Shenzhen Engineering Laboratory of Performance Robots at Digital Stage, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China.
  • Baixue Qiao
    Key Laboratory of Saline-Alkali Vegetation Ecology Restoration, Ministry of Education (Northeast Forestry University), No. 26 Hexing Road, Xiangfang District, Harbin 150001, China.
  • Guohua Wang
    School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China.
  • Shan Huang
    Tianjin Key Laboratory of Ionic-Molecular Function of Cardiovascular Disease, Department of Cardiology, Tianjin Institute of Cardiology, Second Hospital of Tianjin Medical University, 300211 Tianjin, China.
  • Yuming Zhao
  • Shanwen Sun
    University of Bayreuth in Germany. He is now a postdoctoral fellow at the Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China.