A deep learning framework to predict binding preference of RNA constituents on protein surface.

Journal: Nature Communications
PMID:

Abstract

Protein-RNA interactions play important roles in post-transcriptional regulation. However, predicting these interactions from a protein structure is difficult. Here we show that, by leveraging a deep learning model, NucleicNet, attributes such as the binding preference of RNA backbone constituents and different bases can be predicted from local physicochemical characteristics of the protein structure surface. On a diverse set of challenging RNA-binding proteins, including Fem-3-binding-factor 2, Argonaute 2 and Ribonuclease III, NucleicNet can accurately recover interaction modes discovered by structural biology experiments. Furthermore, we show that, without seeing any in vitro or in vivo assay data, NucleicNet can still achieve consistency with experiments, including RNAcompete, immunoprecipitation assays, and an siRNA knockdown benchmark. NucleicNet can thus serve to provide quantitative fitness of RNA sequences for given binding pockets or to predict potential binding pockets and binding RNAs for previously unknown RNA-binding proteins.

Authors

  • Jordy Homing Lam
    Computational Bioscience Research Center, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia.
  • Yu Li
    Department of Public Health, Shihezi University School of Medicine, 832000, China.
  • Lizhe Zhu
    Department of Chemistry, The Hong Kong University of Science and Technology, Hong Kong, China. zhulizhe@cuhk.edu.cn.
  • Ramzan Umarov
    Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia.
  • Hanlun Jiang
    Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, USA.
  • Amélie Héliou
    Laboratoire d'Informatique, Department of Computer Science, École Polytechnique, Palaiseau, France.
  • Fu Kit Sheong
    Department of Chemistry, The Hong Kong University of Science and Technology, Hong Kong, China.
  • Tianyun Liu
    Departments of Medicine, Genetics and Bioengineering, Stanford University, Stanford, CA, USA.
  • Yongkang Long
    Computational Bioscience Research Center, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia.
  • Yunfei Li
    Pharmaceutics Department, Institute of Medicinal Biotechnology, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, 100050, PR China.
  • Liang Fang
    Department of Biology, Southern University of Science and Technology, 518055, Shenzhen, Guangdong, China.
  • Russ B Altman
    Departments of Medicine, Genetics and Bioengineering, Stanford University, Stanford, California, United States of America.
  • Wei Chen
    Department of Urology, Zigong Fourth People's Hospital, Sichuan, China.
  • Xuhui Huang
    Brainnetome Center and National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, 100190 Beijing, China; Research Center for Brain-inspired Intelligence, Institute of Automation, Chinese Academy of Sciences, 100190 Beijing, China. Electronic address: xuhui.huang@ia.ac.cn.
  • Xin Gao
    Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey, USA.