HydRA: Deep-learning models for predicting RNA-binding capacity from protein interaction association context and protein sequence.

Journal: Molecular cell
Published Date:

Abstract

RNA-binding proteins (RBPs) control RNA metabolism to orchestrate gene expression and, when dysfunctional, underlie human diseases. Proteome-wide discovery efforts predict thousands of RBP candidates, many of which lack canonical RNA-binding domains (RBDs). Here, we present a hybrid ensemble RBP classifier (HydRA), which leverages information from both intermolecular protein interactions and internal protein sequence patterns to predict RNA-binding capacity with unparalleled specificity and sensitivity using support vector machines (SVMs), convolutional neural networks (CNNs), and Transformer-based protein language models. Occlusion mapping by HydRA robustly detects known RBDs and predicts hundreds of uncharacterized RNA-binding associated domains. Enhanced CLIP (eCLIP) for HydRA-predicted RBP candidates reveals transcriptome-wide RNA targets and confirms RNA-binding activity for HydRA-predicted RNA-binding associated domains. HydRA accelerates construction of a comprehensive RBP catalog and expands the diversity of RNA-binding associated domains.

Authors

  • Wenhao Jin
    Department of Cellular and Molecular Medicine, University of Califorinia, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA.
  • Kristopher W Brannan
    Department of Cellular and Molecular Medicine, University of Califorinia, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA.
  • Katannya Kapeli
    Department of Physiology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
  • Samuel S Park
    Department of Cellular and Molecular Medicine, University of Califorinia, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA.
  • Hui Qing Tan
    Department of Physiology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
  • Maya L Gosztyla
    Department of Cellular and Molecular Medicine, University of Califorinia, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA.
  • Mayuresh Mujumdar
    Department of Cellular and Molecular Medicine, University of Califorinia, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA.
  • Joshua Ahdout
    Department of Cellular and Molecular Medicine, University of Califorinia, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA.
  • Bryce Henroid
    Department of Cellular and Molecular Medicine, University of Califorinia, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA.
  • Katherine Rothamel
    Department of Cellular and Molecular Medicine, University of Califorinia, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA.
  • Joy S Xiang
    Department of Cellular and Molecular Medicine, University of Califorinia, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA.
  • Limsoon Wong
    Department of Computer Science, National University of Singapore, Singapore; Department of Pathology, National University of Singapore, Singapore.
  • Gene W Yeo
    Department of Cellular and Molecular Medicine, University of Califorinia, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA. Electronic address: geneyeo@ucsd.edu.