Modeling and Predicting the Activities of Trans-Acting Splicing Factors with Machine Learning.

Journal: Cell Systems
Published Date:

Abstract

Alternative splicing (AS) is generally regulated by trans-acting splicing factors that specifically bind to cis-elements in pre-mRNAs. The human genome encodes ∼1,500 RNA binding proteins (RBPs) that potentially regulate AS, yet their functions remain largely unknown. To explore their potential activities, we fused the putative functional domains of RBPs to a sequence-specific RNA-binding domain and systematically analyzed how these engineered factors affect splicing. We discovered that ∼80% of low-complexity domains in endogenous RBPs displayed distinct context-dependent activities in regulating splicing, indicating that AS is under more extensive regulation than previously expected. We developed a machine learning approach to classify and predict the activities of RBPs based on their sequence compositions and further validated this model using endogenous RBPs and synthetic polypeptides. These results represent a systematic inspection, modeling, prediction, and validation of how RBP sequences affect their activities in controlling splicing, paving the way for de novo engineering of artificial splicing factors.

Authors

  • Miaowei Mao
    CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, CAS Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China; Synthetic Biology and Biotechnology Laboratory, State Key Laboratory of Bioreactor Engineering, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China; Signal Transduction Laboratory, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, USA.
  • Yue Hu
    Department of Biobank, China-Japan Union Hospital of Jilin University, Changchun, China.
  • Yun Yang
    Department of Chemistry, South University of Science and Technology, Shenzhen 518055, China.
  • Yajie Qian
    Synthetic Biology and Biotechnology Laboratory, State Key Laboratory of Bioreactor Engineering, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China.
  • Huanhuan Wei
  • Wei Fan
    Department of Epidemiology, School of Public Health, Soochow University, Suzhou 215123, China.
  • Yi Yang
    Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, Chengdu, Sichuan, China.
  • Xiaoling Li
    Department of Infections, Beijing Hospital of Traditional Chinese Medicine, Affiliated to the Capital Medical University, No. 23, Back Road of the Art Gallery, Dongcheng District, Beijing 100010, China.
  • Zefeng Wang
    CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, CAS Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China. Electronic address: wangzefeng@picb.an.cn.