COSSMO: predicting competitive alternative splice site selection using deep learning.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Alternative splice site selection is inherently competitive, and the probability that a given splice site is used also depends on the strength of neighboring sites. Here, we present a new model named the competitive splice site model (COSSMO), which explicitly accounts for these competitive effects and predicts the percent selected index (PSI) distribution over any number of putative splice sites. We model an alternative splicing event as the choice of a 3' acceptor site conditional on a fixed upstream 5' donor site, or as the choice of a 5' donor site conditional on a fixed 3' acceptor site. We build four different architectures that use convolutional layers, communication layers, long short-term memory and residual networks, respectively, to learn relevant motifs from sequence alone. We also construct a new dataset from genome annotations and RNA-Seq read data that we use to train our model.
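
The competitive effect described above can be illustrated with a minimal sketch. It assumes that each putative splice site receives a scalar strength score and that the scores are normalized with a softmax to give a PSI distribution; the abstract does not state the normalization, so this is an illustrative assumption rather than the authors' implementation. The function name psi_distribution and the example scores are hypothetical.

```python
import math

def psi_distribution(site_scores):
    """Normalize per-site strength scores into a PSI distribution.

    Assumption: a softmax is used, so the output sums to 1 over however
    many candidate sites are supplied.
    """
    if not site_scores:
        raise ValueError("need at least one candidate splice site")
    # Subtract the maximum score for numerical stability.
    top = max(site_scores)
    weights = [math.exp(score - top) for score in site_scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical scores for candidate acceptor sites paired with one donor.
two_sites = psi_distribution([2.0, 0.5])
# Adding a third, equally strong competitor lowers the PSI of the first site.
three_sites = psi_distribution([2.0, 0.5, 2.0])
```

Because the scores are normalized jointly, the PSI of a site falls when a strong neighbor is added even though its own score is unchanged, which is the sense in which selection is competitive.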

Authors

  • Hannes Bretschneider
    Department of Electrical and Computer Engineering, University of Toronto, Toronto, Ontario M5S 3G4, Canada. Program on Genetic Networks and Program on Neural Computation & Adaptive Perception, Canadian Institute for Advanced Research, Toronto, Ontario M5G 1Z8, Canada. Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3G4, Canada.
  • Shreshth Gandhi
    Deep Genomics Inc, Toronto, Canada.
  • Amit G Deshwar
    Deep Genomics Inc, Toronto, Canada.
  • Khalid Zuberi
    Deep Genomics Inc, Toronto, Canada.
  • Brendan J Frey
    Department of Electrical and Computer Engineering, University of Toronto, Toronto, Ontario M5S 3G4, Canada. Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada. Program on Genetic Networks and Program on Neural Computation & Adaptive Perception, Canadian Institute for Advanced Research, Toronto, Ontario M5G 1Z8, Canada. Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3G4, Canada. McLaughlin Centre, University of Toronto, Toronto, Ontario M5G 0A4, Canada. Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada. eScience Group, Microsoft Research, Redmond, WA 98052, USA. frey@psi.toronto.edu.