Advancing microbial production through artificial intelligence-aided biology.

Journal: Biotechnology advances
PMID:

Abstract

Microbial cell factories (MCFs) have been leveraged to construct sustainable platforms for producing value-added compounds. To optimize metabolism and maximize productivity, synthetic biology has developed various genetic devices that engineer microbial systems through gene editing, high-throughput protein engineering, and dynamic regulation. However, current synthetic biology methodologies still rely heavily on manual design, laborious testing, and exhaustive analysis. The emerging interdisciplinary field that combines artificial intelligence (AI) and biology has become pivotal in addressing these remaining challenges. AI-aided microbial production harnesses the ability of AI to process, learn from, and make predictions on vast amounts of biological data within seconds, providing high-probability outputs. With well-trained AI models, the conventional Design-Build-Test (DBT) cycle has been transformed into a multidimensional Design-Build-Test-Learn-Predict (DBTLP) workflow, significantly improving operational efficiency and reducing labor. Here, we comprehensively review the main components of and recent advances in AI-aided microbial production, focusing on genome annotation, AI-aided protein engineering, artificial functional protein design, and AI-enabled pathway prediction. Finally, we discuss the challenges of integrating novel AI techniques into biology and highlight the potential of large language models (LLMs) for advancing microbial production.

Authors

  • Xinyu Gong
    School of Chemical, Materials, and Biomedical Engineering, College of Engineering, The University of Georgia, Athens, GA 30602, USA.
  • Jianli Zhang
    School of Chemical, Materials, and Biomedical Engineering, College of Engineering, The University of Georgia, Athens, GA 30602, USA.
  • Qi Gan
    School of Chemical, Materials, and Biomedical Engineering, College of Engineering, The University of Georgia, Athens, GA 30602, USA.
  • Yuxi Teng
    School of Chemical, Materials, and Biomedical Engineering, College of Engineering, The University of Georgia, Athens, GA 30602, USA.
  • Jixin Hou
    School of ECAM, College of Engineering, University of Georgia, Athens, GA 30602, USA.
  • Yanjun Lyu
    Department of Computer Science and Engineering, The University of Texas at Arlington, Arlington, TX 76019, USA.
  • Zhengliang Liu
    School of Computing, University of Georgia, Athens, GA, United States.
  • Zihao Wu
    School of Computing, University of Georgia, Athens, GA, United States.
  • Runpeng Dai
    Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
  • Yusong Zou
    School of Chemical, Materials, and Biomedical Engineering, College of Engineering, The University of Georgia, Athens, GA 30602, USA.
  • Xianqiao Wang
    School of ECAM, College of Engineering, University of Georgia, Athens, GA 30602, USA.
  • Dajiang Zhu
    Department of Computer Science and Engineering, The University of Texas at Arlington, Arlington, TX, United States.
  • Hongtu Zhu
    Department of Biostatistics, University of North Carolina at Chapel Hill, USA. Electronic address: htzhu@email.unc.edu.
  • Tianming Liu
    School of Computing, University of Georgia, Athens, GA, United States.
  • Yajun Yan
    School of Chemical, Materials, and Biomedical Engineering, College of Engineering, The University of Georgia, Athens, GA 30602, USA. Electronic address: yajunyan@uga.edu.