RNA-binding protein recognition based on multi-view deep feature and multi-label learning.

Journal: Briefings in bioinformatics
Published Date:

Abstract

RNA-binding proteins (RBPs) are a class of proteins that bind to and accompany RNAs in regulating biological processes. An RBP may have multiple target RNAs, and its aberrant expression can cause multiple diseases. Existing methods use binary classification models to predict whether a specific RBP can bind to an RNA and where the binding site is located. However, most of these methods do not take into account the binding similarity and correlation between different RBPs. Methods that employ multiple labels and long short-term memory (LSTM) networks have been proposed to consider the binding similarity between different RBPs, but their accuracy remains low due to insufficient feature learning and multi-label learning on RNA sequences. In response to this challenge, this paper proposes the concept of the RNA-RBP Binding Network (RRBN) to provide theoretical support for multi-label learning that identifies the RBPs that can bind to RNAs. Experiments show that the RRBN information can significantly improve the prediction of unknown RNA-RBP interactions. To further improve the prediction accuracy, we present a novel computational method, iDeepMV, which integrates multi-view deep learning under the multi-label learning framework. iDeepMV first takes the RNA sequences as the original view and derives two further views from them: the amino acid sequence and the dipeptide component. Deep neural network models are then designed for the respective views to perform deep feature learning. The extracted deep features are fed into multi-label classifiers, which are trained with the RNA-RBP interaction information for the three views. Finally, a voting mechanism makes a comprehensive decision on the results of the multi-label classifiers. Our experimental results show that the prediction performance of iDeepMV, which combines multi-view deep feature learning models with RNA-RBP interaction information, is significantly better than that of state-of-the-art methods.
iDeepMV is freely available at http://www.csbio.sjtu.edu.cn/bioinf/iDeepMV for academic use. The code is freely available at http://github.com/uchihayht/iDeepMV.
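
The final voting step can be illustrated with a minimal sketch. The abstract does not specify the voting rule, so the per-label majority vote below is an assumption, not the authors' implementation. The function name `majority_vote`, the view names, and the toy predictions are hypothetical. Each view's classifier is assumed to output one binary vector per RNA, with one entry per RBP label.

```python
from typing import Dict, List


def majority_vote(view_predictions: Dict[str, List[List[int]]]) -> List[List[int]]:
    """Combine binary multi-label predictions from several views.

    Assumption: a label is assigned when more than half of the views
    predict it. The actual iDeepMV voting rule may differ.
    """
    views = list(view_predictions.values())
    n_views = len(views)
    n_samples = len(views[0])
    n_labels = len(views[0][0])

    combined = []
    for i in range(n_samples):
        row = []
        for j in range(n_labels):
            # Count how many views predict label j for sample i.
            votes = sum(view[i][j] for view in views)
            row.append(1 if 2 * votes > n_views else 0)
        combined.append(row)
    return combined


# Toy data (hypothetical): 2 RNAs, 3 RBP labels, 3 views.
predictions = {
    "rna_sequence": [[1, 0, 1], [0, 0, 1]],
    "amino_acid_sequence": [[1, 1, 0], [0, 1, 1]],
    "dipeptide_component": [[0, 0, 1], [0, 1, 0]],
}
combined = majority_vote(predictions)
print(combined)
```

With three views, a label is kept only when at least two classifiers agree on it, so a single view's error on a label cannot change the final decision.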

Authors

  • Haitao Yang
    Jiangnan University.
  • Zhaohong Deng
    School of Digital Media, Jiangnan University, Wuxi, Jiangsu, China.
  • Xiaoyong Pan
    Department of Veterinary Clinical and Animal Sciences, University of Copenhagen, Copenhagen, Denmark. xypan172436@gmail.com.
  • Hong-Bin Shen
    Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, 200240, People's Republic of China. hbshen@sjtu.edu.cn.
  • Kup-Sze Choi
    Centre for Smart Health, School of Nursing, Hong Kong Polytechnic University, Hong Kong, China.
  • Lei Wang
    Department of Nursing, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China.
  • Shitong Wang
    School of Digital Media, Jiangnan University, Wuxi, Jiangsu, China.
  • Jing Wu
    School of Pharmaceutical Science, Jiangnan University, Wuxi, 214122, Jiangsu, China.