DeepDualEnhancer: A Dual-Feature Input DNABert Based Deep Learning Method for Enhancer Recognition.

Journal: International journal of molecular sciences
PMID:

Abstract

Enhancers are cis-regulatory DNA sequences that are widely distributed throughout the genome and can precisely regulate the expression of target genes. Because the features of enhancer segments are difficult to detect, we propose DeepDualEnhancer, a DNABert-based method that uses a multi-scale convolutional neural network and a BiLSTM for enhancer identification. We first designed DeepDualEnhancer to take only the DNA sequence as input. It mainly consists of a multi-scale convolutional neural network and a BiLSTM, which extract features from the DNABert and embedding inputs, respectively. We also collected new datasets from the enhancer-promoter interaction field and designed DeepDualEnhancer-genomic, which takes both DNA sequences and genomic signals as input and incorporates transformer sequence attention. Extensive comparisons with 20 other methods, using 5-fold cross-validation, ablation experiments, and an independent test, demonstrated that DeepDualEnhancer achieves the best performance. We also found that including genomic signals improves performance on the enhancer recognition task.
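
As an illustration of two steps the abstract mentions, the following minimal Python sketch shows how a DNA sequence might be split into overlapping k-mers (the token form that DNABert-style models typically consume) and how sample indices could be partitioned for 5-fold cross-validation. This is not the authors' code. The function names `kmer_tokens` and `k_fold_indices`, the choice k = 6, the stride of 1, and the shuffling seed are all assumptions made for illustration; the abstract does not specify them.

```python
import random


def kmer_tokens(sequence, k=6):
    """Split a DNA sequence into overlapping k-mers with stride 1.

    k=6 is an assumed value; the abstract does not state which k is used.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    sequence = sequence.upper()
    # A sequence shorter than k yields an empty token list.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Indices are shuffled once with a fixed seed (an assumption), then dealt
    into n_folds groups; each group serves as the held-out set exactly once.
    """
    if not 2 <= n_folds <= n_samples:
        raise ValueError("need 2 <= n_folds <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out in range(n_folds):
        test = folds[held_out]
        train = [idx
                 for j, fold in enumerate(folds) if j != held_out
                 for idx in fold]
        yield train, test


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from any dataset in the paper.
    print(kmer_tokens("ACGTACGTAC", k=6))
    for train, test in k_fold_indices(n_samples=10, n_folds=5):
        print(len(train), len(test))
```

In this sketch every sample is held out in exactly one fold, so performance averaged over the five folds uses each sample once for testing. A stratified split that preserves the enhancer/non-enhancer ratio in each fold would be a natural refinement, but whether the authors stratified is not stated.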

Authors

  • Tao Song
    Department of Cleft Lip and Palate, Plastic Surgery Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing.
  • Haonan Song
    Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum, Qingdao 266555, China.
  • Zhiyi Pan
    Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum, Qingdao 266555, China.
  • Yuan Gao
    Engineering Research Center of EMR and Intelligent Expert System, Ministry of Education, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, Zhejiang Province, China.
  • Huanhuan Dai
    Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum, Qingdao 266555, China.
  • Xun Wang
    College of Computer Science and Technology, China University of Petroleum, Dongying, China.