HybProm: An attention-assisted hybrid CNN-BiLSTM model for the interpretable prediction of DNA promoter.

Journal: Methods (San Diego, Calif.)

Abstract

Promoter prediction is essential for analyzing gene structure, understanding regulatory networks and transcription mechanisms, and precisely controlling gene expression. Computational and deep learning methods for promoter prediction have recently gained attention, but there is still room to improve their accuracy. To address this, we propose HybProm, a model that uses DNA2Vec to transform DNA sequences into low-dimensional vectors and then applies a CNN-BiLSTM-Attention architecture to extract features and predict promoters across species, including E. coli, humans, mice, and plants. Experiments show that HybProm consistently achieves high accuracy (90%-99%) and offers good interpretability by identifying the key sequence patterns and positions that drive its predictions.
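
The interpretability claim can be illustrated with a minimal sketch of attention pooling over sequence positions. This is not the authors' implementation: it assumes overlapping k-mer tokenization (as used by k-mer embedding approaches such as DNA2Vec), omits the CNN and BiLSTM layers entirely, and uses a hypothetical toy embedding table and query vector in place of learned parameters. It only shows how attention weights can point to the positions that contribute most to a pooled representation.

```python
import math

def kmer_tokens(seq, k=3):
    """Split a DNA sequence into overlapping k-mers (stride 1)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, query):
    """Dot-product attention over positions.

    Returns the per-position weights and the weighted sum of the
    feature vectors.
    """
    scores = [sum(f * q for f, q in zip(vec, query)) for vec in features]
    weights = softmax(scores)
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, features))
        for d in range(len(query))
    ]
    return weights, pooled

# Hypothetical toy embeddings standing in for pretrained k-mer vectors.
toy_embedding = {"TAT": [1.0, 0.0], "ATA": [0.8, 0.2], "TAA": [0.9, 0.1]}
default_vec = [0.0, 0.0]  # assumption: unknown k-mers map to zeros

tokens = kmer_tokens("GCTATAAT", k=3)
# In a full model these features would come from the CNN-BiLSTM layers;
# here the embeddings are used directly to keep the sketch short.
features = [toy_embedding.get(t, default_vec) for t in tokens]

# Hypothetical query vector; in a trained model it would be learned.
weights, pooled = attention_pool(features, query=[2.0, 0.0])

# The position with the largest weight is the one the pooled vector
# depends on most, which is the basis for position-level interpretation.
top = max(range(len(tokens)), key=lambda i: weights[i])
print(tokens[top], round(weights[top], 3))
```

In this toy setup the highest weights fall on the k-mers that have non-zero embeddings, so inspecting `weights` shows which positions dominate the pooled vector.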

Authors

  • Rentao Luo
    College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, 341000, China.
  • Jiawei Liu
    School of Biomedical Engineering, The Sixth Affiliated Hospital, Guangzhou Medical University, Guangzhou, Guangdong 511436, China.
  • Lixin Guan
    College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, 341000, China.
  • Mengshan Li
    College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, 341000, China. jcimsli@163.com.