Deep learning analyses of splicing variants identify the link of PCP4 with amyotrophic lateral sclerosis.

Journal: Brain : a journal of neurology
Published Date:

Abstract

Amyotrophic lateral sclerosis (ALS) is a severe motor neuron disease, with most sporadic cases lacking clear genetic causes. Abnormal pre-mRNA splicing is a fundamental mechanism in neurodegenerative diseases. For example, TAR DNA-binding protein 43 (TDP-43) loss of function causes widespread RNA mis-splicing events in ALS. Additionally, splicing mutations are major contributors to neurological disorders. However, the role of intronic variants driving RNA mis-splicing in ALS remains poorly understood. To address this, we developed Spliformer to predict RNA splicing. Spliformer is a transformer-based deep learning model trained and tested on splicing events from the GENCODE database, in addition to RNA-sequencing data from blood and CNS tissues. We benchmarked Spliformer against SpliceAI and Pangolin using testing datasets and paired whole-genome sequencing with RNA-sequencing data. We also developed the Spliformer-motif model to identify splicing regulatory motifs. We analysed the ClinVar dataset to assess the association between splicing variants and disease pathogenicity. Additionally, we analysed whole-genome sequencing data of ALS patients and controls to identify common intronic splicing variants linked to ALS risk or disease phenotypes. We also profiled rare intronic splicing variants in ALS patients to identify known or novel ALS-associated genes. Minigene assays were used to validate candidate splicing variants. Finally, we measured spine density in neurons with a specific gene knockdown or those expressing a TDP-43 disease-causing mutant. Spliformer accurately predicts the probability of each nucleotide within a pre-mRNA sequence being a splice donor, acceptor or neither. Spliformer outperformed SpliceAI and Pangolin in both speed and accuracy on the tested splicing events and/or paired whole-genome sequencing/RNA-sequencing data. Spliformer-motif successfully identified canonical and novel splicing regulatory motifs.
In the ClinVar dataset, splicing variants are strongly associated with disease pathogenicity. Genome-wide analyses of common intronic splicing variants nominated one variant linked to ALS progression. Deep learning analyses of whole-genome sequencing data from 1370 ALS patients revealed rare splicing variants in reported ALS genes (such as PTPRN2 and CFAP410, validated through minigene assays and RNA sequencing) and TDP-43 loss-of-function-related RNA mis-splicing genes (such as PTPRD). Further genetic analysis and minigene assays nominated PCP4 and TMEM63A as ALS-associated genes. Functional assays demonstrated that PCP4 is crucial for maintaining spine density and can rescue spine loss in neurons expressing a disease-causing TDP-43 mutant. In summary, we developed Spliformer and Spliformer-motif, which accurately predict and interpret pre-mRNA splicing. Our findings highlight an intronic genetic mechanism driving RNA mis-splicing in ALS and nominate PCP4 as an ALS-associated gene.
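
The abstract describes a model that assigns each nucleotide of a pre-mRNA sequence a probability of being a splice donor, a splice acceptor or neither. The sketch below only illustrates that output format and one plausible way a variant could be scored from it. It is not Spliformer: the random linear weights stand in for a trained network, and the names predict, delta_score, CONTEXT and the example sequence are all hypothetical. Scoring a variant as the largest change in donor or acceptor probability between the reference and alternate sequence is an assumption borrowed from the general style of splice-prediction tools, not a method stated in this abstract.

```python
import math
import random

BASES = "ACGT"
CLASSES = ("neither", "acceptor", "donor")
CONTEXT = 2  # toy flanking window; real models use far longer context


def one_hot(base):
    """Encode one base as 4 numbers; padding or unknown bases (N) give all zeros."""
    return [1.0 if base == b else 0.0 for b in BASES]


def softmax(logits):
    """Turn three raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(seq, weights):
    """Return one [p_neither, p_acceptor, p_donor] row per nucleotide.

    `weights` is a hypothetical stand-in for a trained model: one weight
    vector per class, applied to the one-hot encoded window around each base.
    """
    padded = "N" * CONTEXT + seq.upper() + "N" * CONTEXT
    rows = []
    for i in range(len(seq)):
        window = padded[i:i + 2 * CONTEXT + 1]
        features = [x for base in window for x in one_hot(base)]
        logits = [sum(f * w for f, w in zip(features, col)) for col in weights]
        rows.append(softmax(logits))
    return rows


def delta_score(ref, alt, weights):
    """Largest change in acceptor or donor probability at any position.

    Assumes a substitution, so ref and alt have the same length.
    """
    p_ref, p_alt = predict(ref, weights), predict(alt, weights)
    return max(abs(a[k] - r[k]) for r, a in zip(p_ref, p_alt) for k in (1, 2))


# Random weights only demonstrate the data flow; they carry no biology.
rng = random.Random(0)
weights = [[rng.gauss(0, 1) for _ in range((2 * CONTEXT + 1) * 4)] for _ in CLASSES]

ref = "CAGGTAAGTACCTTTCAGGA"      # made-up sequence
alt = ref[:4] + "A" + ref[5:]     # single-base substitution at index 4 (T to A)
probs = predict(ref, weights)
score = delta_score(ref, alt, weights)
```

Each row of probs sums to 1 across the three classes, and score lies between 0 and 1. With a trained model in place of the random weights, a high score for an intronic variant would flag it as a candidate for experimental follow-up such as the minigene assays described above.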

Authors

  • Xuelin Tang
    State Key Laboratory of Cardiology and Medical Innovation Center, Shanghai East Hospital, Clinical Center for Brain and Spinal Cord Research, School of Medicine, Tongji University, Shanghai 200331, China.
  • Yan Chen
    Department of Respiratory and Critical Care Medicine, Shanghai Pulmonary Hospital, School of Medicine, Tongji University, Shanghai, China.
  • Yongfei Ren
    Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 201210, China.
  • Wanli Yang
    Shenzhen Engineering Laboratory for Eco-efficient Recycled Materials, School of Environment and Energy, Peking University, Shenzhen Graduate School, University Town, Xili, Nanshan District, Shenzhen 518055, China.
  • Wendi Yu
    Department of Neurology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, PR China.
  • Yu Zhou
    Department of Biospectroscopy, Leibniz-Institut für Analytische Wissenschaften - ISAS - e.V., Dortmund, Germany.
  • Jingyan Guo
    State Key Laboratory of Cardiology and Medical Innovation Center, Shanghai East Hospital, Clinical Center for Brain and Spinal Cord Research, School of Medicine, Tongji University, Shanghai 200331, China.
  • Jiali Hu
    Department of Clinical Pharmacy, Shanghai First People's Hospital, Shanghai Jiao Tong University, Shanghai 201620, PR China.
  • Xi Chen
    Department of Critical Care Medicine, Shenzhen Hospital, Southern Medical University, Guangdong, Shenzhen, China.
  • Yuqi Gu
    State Key Laboratory of Cardiology and Medical Innovation Center, Shanghai East Hospital, Clinical Center for Brain and Spinal Cord Research, School of Medicine, Tongji University, Shanghai 200331, China.
  • Chuyi Wang
    Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 201210, China.
  • Yi Dong
    Department of Neurology, Huashan Hospital, Fudan University, Shanghai, China.
  • Hong Yang
    Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou, China.
  • Christine Sato
    Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, Toronto, Ontario M5T 0S8, Canada.
  • Ji He
  • Dongsheng Fan
  • Linya You
    Department of Human Anatomy and Histoembryology, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China.
  • Lorne Zinman
  • Ekaterina Rogaeva
    Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, Medical Sciences Building, 1 King's College Cir Suite 2374, Toronto, ON, M5S 1A8, Canada.
  • Yelin Chen
    Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 201210, China.
  • Ming Zhang
    Heilongjiang Key Laboratory for Laboratory Animals and Comparative Medicine, College of Veterinary Medicine, Harbin 150030, China.