Utilizing a deep learning model based on BERT for identifying enhancers and their strength.

Journal: PloS one

PMID: 40203028

Abstract

An enhancer is a DNA sequence, typically located upstream or downstream of a gene, that serves as a pivotal element in the regulation of eukaryotic gene transcription. Recognizing enhancers is therefore important for understanding how gene expression is regulated. Although several useful predictive models have been proposed, they still have deficiencies. To address these limitations, we propose DNABERT2-Enhancer, a deep learning model based on the transformer architecture, designed to recognize enhancers (classifying a sequence as enhancer or non-enhancer) and to identify their activity (strong or weak). More specifically, DNABERT2-Enhancer consists of a BERT model for feature extraction and a CNN model for enhancer classification. The parameters of the BERT model are initialized from the pre-trained DNABERT-2 language model. The BERT model is then fine-tuned on the enhancer recognition task through transfer learning so that it converts the original sequence into feature vectors. Subsequently, the CNN learns from the feature vectors generated by BERT and produces the prediction results. Compared with existing predictors evaluated on the same dataset, our approach demonstrates superior performance. This suggests that the model will be a useful tool for academic research on enhancer recognition.
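
The two-level decision described in the abstract (enhancer vs. non-enhancer, then strong vs. weak) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `build_stage` and `predict_enhancer`, the 0.5 threshold, and the toy GC-content feature and scoring heads are all assumptions introduced here for illustration. In the actual model, a fine-tuned DNABERT-2 encoder would play the role of the feature extractor and a CNN would play the role of the scoring head.

```python
from typing import Callable, List

# Hypothetical type aliases for illustration; not taken from the paper.
FeatureExtractor = Callable[[str], List[float]]   # role played by BERT
Head = Callable[[List[float]], float]             # role played by the CNN
Stage = Callable[[str], float]                    # sequence -> probability-like score


def build_stage(extract: FeatureExtractor, head: Head) -> Stage:
    """Compose a feature extractor and a classifier head into one stage."""
    return lambda sequence: head(extract(sequence.upper()))


def predict_enhancer(sequence: str, is_enhancer: Stage, is_strong: Stage,
                     threshold: float = 0.5) -> str:
    """Two-level cascade: stage 1 filters non-enhancers, stage 2 grades strength.

    The threshold of 0.5 is an assumed default, not a value from the paper.
    """
    if is_enhancer(sequence) < threshold:
        return "non-enhancer"
    return "strong enhancer" if is_strong(sequence) >= threshold else "weak enhancer"


def gc_features(sequence: str) -> List[float]:
    """Toy stand-in for learned embeddings: a single GC-fraction feature."""
    n = max(len(sequence), 1)
    return [sum(base in "GC" for base in sequence) / n]


# Toy heads standing in for trained classifiers (purely illustrative).
enhancer_stage = build_stage(gc_features, lambda f: min(1.0, 2.0 * f[0]))
strength_stage = build_stage(gc_features, lambda f: f[0])

for seq in ("ATATATAT", "ATGCGTAT", "GCGCGCGC"):
    print(seq, "->", predict_enhancer(seq, enhancer_stage, strength_stage))
```

Keeping each stage as a plain callable means the toy heads can be swapped for trained models without changing the cascade logic.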

Authors

Tong Wang

School of Public Health, Shanxi Medical University, Taiyuan 030000, China; Key Laboratory of Coal Environmental Pathogenicity and Prevention (Shanxi Medical University), Ministry of Education, Taiyuan 030000, China.

Mengqi Gao

School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai, China.

Keywords

Algorithms; Computational Biology; Deep Learning; Enhancer Elements, Genetic; Humans; Neural Networks, Computer
