MAResNet: predicting transcription factor binding sites by combining multi-scale bottom-up and top-down attention and residual network.

Journal: Briefings in bioinformatics
Published Date:

Abstract

Accurate identification of transcription factor binding sites is of great significance in understanding gene expression, biological development and drug design. Although a variety of methods based on deep-learning models and large-scale data have been developed to predict transcription factor binding sites in DNA sequences, there remains room for improvement in prediction performance. In addition, effective interpretation of deep-learning models is highly desirable. Here we present MAResNet, a new deep-learning method for predicting transcription factor binding sites, evaluated on 690 ChIP-seq datasets. More specifically, MAResNet combines the bottom-up and top-down attention mechanisms with a state-of-the-art feed-forward network (ResNet); the resulting network is constructed by stacking attention modules that generate attention-aware features. In particular, a multi-scale attention mechanism is used at the first stage to extract rich and representative sequence features. We further discuss how the attention-aware features learned by different attention modules change as the layers go deeper. The features learned by MAResNet are also visualized with the TMAP tool to illustrate that the method can extract the unique characteristics of transcription factor binding sites. On the 690 test subsets, MAResNet achieves an average AUC of 0.927, which is higher than that of the current state-of-the-art methods. Overall, this study provides a new and useful framework for the prediction of transcription factor binding sites by combining the funnel attention modules with the residual network.
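
The abstract describes the architecture only in words, so the following is a minimal NumPy sketch of the general idea and not the authors' implementation. It assumes the attention modules follow the common residual-attention form, output = (1 + M(x)) * T(x), where T is a trunk branch and M is a soft mask produced by a bottom-up (downsampling) pass followed by a top-down (upsampling) pass; the abstract does not state this formula. The function names, kernel sizes (3, 5, 7), channel count (8), and the random untrained weights are all hypothetical and chosen for illustration.

```python
import numpy as np

def one_hot(seq):
    """Encode a DNA string as a (length, 4) array; unknown bases (e.g. N) stay all-zero."""
    idx = {"A": 0, "C": 1, "G": 2, "T": 3}
    out = np.zeros((len(seq), 4))
    for i, base in enumerate(seq.upper()):
        if base in idx:
            out[i, idx[base]] = 1.0
    return out

def conv1d_same(x, w):
    """Zero-padded 1D convolution. x: (L, C_in); w: (k, C_in, C_out) with odd k."""
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    # One shifted copy of the input per kernel tap: shape (k, L, C_in).
    windows = np.stack([xp[i:i + x.shape[0]] for i in range(k)], axis=0)
    return np.einsum("klc,kco->lo", windows, w)

def soft_mask(x, w):
    """Bottom-up (max-pool by 2) then top-down (upsample by 2) pass; values lie in (0, 1)."""
    L, C = x.shape
    pooled = x[: L - L % 2].reshape(L // 2, 2, C).max(axis=1)  # bottom-up: halve resolution
    h = conv1d_same(pooled, w)
    up = np.repeat(h, 2, axis=0)                               # top-down: restore resolution
    if up.shape[0] < L:                                        # odd L: repeat the last row
        up = np.vstack([up, up[-1:]])
    return 1.0 / (1.0 + np.exp(-up))

def attention_module(x, w_trunk, w_mask):
    """Assumed residual-attention form: (1 + M(x)) * T(x)."""
    trunk = np.maximum(conv1d_same(x, w_trunk), 0.0)           # trunk branch with ReLU
    return (1.0 + soft_mask(x, w_mask)) * trunk

def multi_scale_stage(x, params):
    """Hypothetical first stage: one attention module per kernel size, concatenated."""
    return np.concatenate([attention_module(x, wt, wm) for wt, wm in params], axis=1)

rng = np.random.default_rng(0)
x = one_hot("ACGTNACGTA")                                      # 10 bases -> (10, 4)
params = [(rng.normal(size=(k, 4, 8)), rng.normal(size=(k, 4, 8))) for k in (3, 5, 7)]
features = multi_scale_stage(x, params)
print(features.shape)  # (10, 24): 3 kernel sizes x 8 channels each
```

The "1 +" term is what makes the attention residual: where the mask is near zero the trunk features pass through largely unchanged instead of being suppressed, which is one reason such modules can be stacked deeply.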

Authors

  • Ke Han
    School of Computer and Information Engineering, Harbin University of Commerce, Harbin 150040, China. hanke@hrbcu.edu.cn.
  • Long-Chen Shen
    School of Computer Science and Engineering, Nanjing University of Science and Technology, China.
  • Yi-Heng Zhu
    School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China.
  • Jian Xu
    Department of Cardiology, Lishui Central Hospital and the Fifth Affiliated Hospital of Wenzhou Medical University, Lishui, China.
  • Jiangning Song
    College of Information Engineering, Northwest A&F University, Yangling 712100, China, Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia, National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China, Centre for Research in Intelligent Systems, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia and ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC 3800, Australia.
  • Dong-Jun Yu