Combining diffusion and transformer models for enhanced promoter synthesis and strength prediction in deep learning.

Journal: mSystems
PMID:

Abstract

In synthetic biology, engineering synthetic promoters that outperform their natural counterparts is of paramount importance: such promoters can optimize the expression of exogenous genes, enhance the efficiency of metabolic pathways, and carry substantial commercial value. Research indicates that some synthetic promoters have higher transcriptional activity than strong natural promoters. However, because a promoter sequence of length n admits 4^n possible nucleotide combinations, the search space grows exponentially with length, and identifying effective synthetic promoters remains a formidable challenge. Deep learning models, which learn adaptively from extensive data sets, have become instrumental in analyzing biological data. This study introduces a diffusion model-based approach for designing promoters viable in model bacteria, including cyanobacteria. The model learns inherent biological features from natural promoter sequences and uses them to engineer synthetic variants. Additionally, we employed a transformer model to evaluate the strength of these synthetic promoters, with the aim of screening for high-performing candidates. The experimental findings suggest that the synthetic promoters generated by the diffusion model not only share key biological features with their natural counterparts but are also more similar to natural promoters than those generated by a variational autoencoder. In predicting promoter strength, the transformer model outperformed a convolutional neural network. Finally, we developed an integrated platform for generating promoters and predicting their strength.
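
The exponential growth of the promoter search space noted in the abstract can be made concrete with a little arithmetic. The sketch below assumes four possible nucleotides (A, C, G, T) at each position, so a sequence of length n has 4^n variants. The function name and the example lengths are illustrative choices, not values taken from the study.

```python
def sequence_space_size(length: int, alphabet_size: int = 4) -> int:
    """Count the distinct sequences of a given length over an alphabet.

    With the default alphabet_size of 4 (A, C, G, T) this is 4**length.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    return alphabet_size ** length


if __name__ == "__main__":
    # Illustrative lengths only; the study's promoter lengths are not assumed here.
    for n in (6, 10, 50):
        print(f"length {n}: {sequence_space_size(n):.3e} possible sequences")
```

Even a 10-nucleotide region already has 1,048,576 variants, which is why exhaustive screening is impractical and generative models paired with a strength predictor are attractive.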

Authors

  • Xin Lei
    Department of Rheumatology and Immunology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang, China.
  • Xing Wang
    Department of Neurosis and Psychosomatic Diseases, Huzhou Third Municipal Hospital, The Affiliated Hospital of Huzhou University, Huzhou, Zhejiang, China.
  • Guanlin Chen
    School of Future Technology, South China University of Technology, Guangzhou, Guangdong, China.
  • Ce Liang
    Research Projects Department, Guangdong Artificial Intelligence and Digital Economy Laboratory (Guangzhou), Guangzhou, Guangdong, China.
  • Quhuan Li
    School of Bioscience and Bioengineering, South China University of Technology, Guangzhou, China.
  • Huaiguang Jiang
    School of Future Technology, South China University of Technology, Guangzhou, Guangdong, China.
  • Wei Xiong
    Department of Nutrition and Health, China Agricultural University, Beijing 100193, China; Food Laboratory of Zhongyuan, Luohe, Henan 462300, China. Electronic address: xiongwei910702@126.com.