RNA splicing. The human splicing code reveals new insights into the genetic determinants of disease.
Journal: Science (New York, N.Y.)
PMID: 25525159
Abstract
To facilitate precision medicine and whole-genome annotation, we developed a machine-learning technique that scores how strongly genetic variants affect RNA splicing, whose alteration contributes to many diseases. Analysis of more than 650,000 intronic and exonic variants revealed widespread patterns of mutation-driven aberrant splicing. Intronic disease mutations that are more than 30 nucleotides from any splice site alter splicing nine times as often as common variants, and missense exonic disease mutations that have the least impact on protein function are five times as likely as others to alter splicing. We detected tens of thousands of disease-causing mutations, including those involved in cancers and spinal muscular atrophy. Examination of intronic and exonic variants found using whole-genome sequencing of individuals with autism revealed misspliced genes with neurodevelopmental phenotypes. Our approach provides evidence for causal variants and should enable new discoveries in precision medicine.
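The "nine times as often" and "five times as likely" comparisons in the abstract are fold-enrichment ratios: the fraction of variants in one group that alter splicing divided by the same fraction in a comparison group. The sketch below illustrates only that arithmetic and is not the authors' method. The function names, the per-variant scores, and the 0.5 cutoff for calling a variant "splicing-altering" are all hypothetical, chosen for illustration.

```python
from typing import Sequence


def fraction_altering(scores: Sequence[float], threshold: float) -> float:
    """Fraction of variants whose absolute score meets the (assumed) threshold."""
    if not scores:
        raise ValueError("scores must be non-empty")
    return sum(abs(s) >= threshold for s in scores) / len(scores)


def fold_enrichment(
    group: Sequence[float], baseline: Sequence[float], threshold: float
) -> float:
    """Ratio of splicing-altering rates: group rate divided by baseline rate."""
    baseline_rate = fraction_altering(baseline, threshold)
    if baseline_rate == 0:
        raise ValueError("baseline has no variants above threshold; ratio undefined")
    return fraction_altering(group, threshold) / baseline_rate


# Made-up splicing-effect scores, NOT data from the paper.
disease_scores = [0.9, 0.1, 0.7, 0.0, 0.6, 0.8, 0.2, 0.5, 0.75, 0.95]
common_scores = [0.0, 0.1, 0.05, 0.6, 0.2, 0.0, 0.1, 0.3, 0.15, 0.02]

# Hypothetical cutoff of 0.5 for counting a variant as altering splicing.
print(fold_enrichment(disease_scores, common_scores, threshold=0.5))
```

With these toy inputs, 7 of 10 disease variants and 1 of 10 common variants pass the cutoff, so the ratio is about 7. The resulting ratio depends on the chosen threshold, so any real comparison would need to state it.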
Authors
Keywords
Adaptor Proteins, Signal Transducing
Artificial Intelligence
Child Development Disorders, Pervasive
Colorectal Neoplasms, Hereditary Nonpolyposis
Computer Simulation
DNA
Exons
Genetic Code
Genetic Markers
Genetic Variation
Genome-Wide Association Study
Humans
Introns
Models, Genetic
Molecular Sequence Annotation
Muscular Atrophy, Spinal
Mutation, Missense
MutL Protein Homolog 1
Nuclear Proteins
Polymorphism, Single Nucleotide
Quantitative Trait Loci
RNA Splice Sites
RNA Splicing
RNA-Binding Proteins