GCBLANE: A graph-enhanced convolutional BiLSTM attention network for improved transcription factor binding site prediction
Journal: arXiv
Published Date: Mar 16, 2025
Abstract
Identifying transcription factor binding sites (TFBS) is crucial for
understanding gene regulation, as these sites enable transcription factors
(TFs) to bind to DNA and modulate gene expression. Despite advances in
high-throughput sequencing, accurately identifying TFBS remains challenging due
to the sheer volume of genomic data and the complexity of binding patterns. GCBLANE, a
graph-enhanced convolutional bidirectional Long Short-Term Memory (LSTM)
attention network, is introduced to address this issue. It integrates
convolutional, multi-head attention, and recurrent layers with a graph neural
network to detect key features for TFBS prediction. On 690 ENCODE ChIP-Seq
datasets, GCBLANE achieved an average AUC of 0.943, and on 165 ENCODE datasets,
it reached an AUC of 0.9495, outperforming advanced models that utilize
multimodal approaches, including DNA shape information. These results indicate
that combining graph-based learning with sequence analysis improves TFBS
prediction over existing methods.
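
The AUC figures quoted above are averages over many per-dataset scores. As a minimal sketch of how such a number can be obtained, the following computes the area under the ROC curve for a single dataset from binary labels (1 = bound, 0 = unbound) and predicted scores, then averages it across datasets. The function names `roc_auc` and `mean_auc` and the toy inputs are illustrative assumptions, not taken from the paper, and the paper's own evaluation code may differ.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Tied scores receive the average of the ranks they span, so a tie
    between a positive and a negative counts as half a correct ordering.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")

    # Indices sorted by ascending score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)

    i = 0
    while i < len(order):
        # Find the end of the current run of tied scores.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


def mean_auc(datasets: Sequence[Tuple[Sequence[int], Sequence[float]]]) -> float:
    """Unweighted mean of per-dataset AUCs (hypothetical helper)."""
    aucs = [roc_auc(labels, scores) for labels, scores in datasets]
    return sum(aucs) / len(aucs)


if __name__ == "__main__":
    # Toy data only; these are not results from the paper.
    toy = [
        ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
        ([0, 1], [0.2, 0.9]),
    ]
    print(mean_auc(toy))
```

The unweighted mean treats every dataset equally regardless of its size, which is one common reading of "average AUC"; a size-weighted mean would be an alternative design choice.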