BertSNR: an interpretable deep learning framework for single-nucleotide resolution identification of transcription factor binding sites based on DNA language model.

Journal: Bioinformatics (Oxford, England)
PMID:

Abstract

MOTIVATION: Transcription factors are pivotal in regulating gene expression, and accurately identifying transcription factor binding sites (TFBSs) at high resolution is crucial for understanding the mechanisms of gene regulation. Identifying TFBSs from DNA sequences remains a significant challenge in computational biology. A variety of computational approaches have been developed to address it, but these methods are limited in their ability to achieve high-resolution identification and often lack interpretability.

Authors

  • Hanyu Luo
    Department of Cardiology of Lu'an People's Hospital, Lu'an Hospital of Anhui Medical University, Lu'an, China.
  • Li Tang
    School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China.
  • Min Zeng
    Nephrology Department, Affiliated Hospital of Southern Medical University: Shenzhen Longhua New District People's Hospital, Shenzhen, China.
  • Rui Yin
    Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, College of Medicine, FL, USA. Electronic address: ruiyin@ufl.edu.
  • Pingjian Ding
    Center for Artificial Intelligence in Drug Discovery, School of Medicine, Case Western Reserve University, Cleveland, OH, United States.
  • Lingyun Luo
    School of Computer Sciences, University of South China, Hengyang 421001, China. Electronic address: luoly@usc.edu.cn.
  • Min Li
    Hubei Provincial Institute for Food Supervision and Test, Hubei Provincial Engineering and Technology Research Center for Food Quality and Safety Test, Wuhan 430075, China.