iEnhancer-DCLA: using the original sequence to identify enhancers and their strength based on a deep learning framework.

Journal: BMC Bioinformatics
PMID:

Abstract

Enhancers are short regions of DNA that are bound by proteins to enhance the transcription of genes. An enhancer may be located upstream or downstream of its target gene, and it need not be close to that gene, because the folded structure of chromatin allows positions that are far apart in the sequence to come into contact with each other. Identifying enhancers and their strength is therefore a complex and challenging task. In this article, we propose a new deep learning-based prediction method, called iEnhancer-DCLA, to identify enhancers and their strength. First, we use word2vec to convert k-mers into numerical vectors, from which an input matrix is constructed. Second, we use a convolutional neural network and a bidirectional long short-term memory network to extract sequence features. Finally, we use an attention mechanism to select the relatively important features. In the tasks of predicting enhancers and their strength, the method improves to some extent on most evaluation metrics. In summary, we believe that this method provides new ideas for the analysis of enhancers.
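
The first step described above, turning a raw sequence into k-mer "words" before an embedding model such as word2vec is applied, can be illustrated with a minimal Python sketch. The abstract does not state the value of k, the stride, or how the vocabulary is built, so the function names (`kmer_tokens`, `build_vocab`), the default `k=3`, and the stride of 1 are assumptions made for illustration only and are not taken from the iEnhancer-DCLA implementation.

```python
from typing import Dict, List


def kmer_tokens(sequence: str, k: int = 3) -> List[str]:
    """Split a DNA sequence into overlapping k-mers (stride 1).

    k=3 is an illustrative default, not a value taken from the paper.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    # A sequence shorter than k yields an empty list.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_vocab(token_lists: List[List[str]]) -> Dict[str, int]:
    """Assign each distinct k-mer an integer id in order of first appearance."""
    vocab: Dict[str, int] = {}
    for tokens in token_lists:
        for tok in tokens:
            # setdefault only inserts when the k-mer has not been seen yet.
            vocab.setdefault(tok, len(vocab))
    return vocab


# Hypothetical usage: tokenize one short sequence and map k-mers to ids.
tokens = kmer_tokens("ACGTAC", k=3)
vocab = build_vocab([tokens])
ids = [vocab[t] for t in tokens]
```

In a pipeline like the one summarized above, each integer id (or the k-mer string itself) would then be looked up in a trained embedding table to give one row of the input matrix; that lookup and the downstream network are not shown here.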

Authors

  • Meng Liao
    College of Mathematics and System Sciences, Xinjiang University, Ürümqi, China.
  • Jian-Ping Zhao
    College of Mathematics and System Sciences, Xinjiang University, Urumqi, China. Electronic address: zhaojianping@126.com.
  • Jing Tian
    School of Biological Engineering, Dalian Polytechnic University, No. 1 Qinggongyuan, Ganjingzi, Dalian 116034, P. R. China. Electronic address: liqian19820903@163.com. +86-411-86323725.
  • Chun-Hou Zheng