SpliceFinder: ab initio prediction of splice sites using convolutional neural network.

Journal: BMC bioinformatics
Published Date:

Abstract

BACKGROUND: Identifying splice sites is a necessary step in analyzing the location and structure of genes. Two dinucleotides, GT and AG, occur at most splice sites, while other, non-canonical patterns also occur at splice sites and have important biological functions. At the same time, these dinucleotides occur frequently in sequences that contain no splice site, which makes prediction prone to false positives. Most existing tools first select every sequence containing the two dimers and then concentrate on distinguishing true splice sites from pseudo ones. This approach reduces false positives, but it misses non-canonical splice sites.
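
The dimer-based candidate selection described above can be illustrated with a minimal sketch. It is not the code of SpliceFinder or of any existing tool: the function name `candidate_sites` and the toy sequence are hypothetical, chosen only to show why a GT/AG filter yields many candidates and cannot see a non-canonical site such as a GC donor.

```python
def candidate_sites(seq):
    """Return 0-based start positions of GT (donor-like) and AG (acceptor-like) dimers.

    Hypothetical helper for illustration only; real tools also extract
    flanking sequence around each candidate before classifying it.
    """
    seq = seq.upper()
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return donors, acceptors


# Toy sequence, made up for this example (not real genomic data).
toy = "CAGGTAAGTCCGCAAGT"
donors, acceptors = candidate_sites(toy)

print("GT candidates:", donors)     # every GT is a candidate, true site or not
print("AG candidates:", acceptors)  # every AG is a candidate, true site or not

# A GC dimer (as in non-canonical GC-AG introns) never enters the candidate set.
gc_positions = [i for i in range(len(toy) - 1) if toy[i:i + 2] == "GC"]
print("GC positions ignored by the filter:", gc_positions)
```

Even in this 17-base toy sequence the filter returns several GT and AG positions, each of which would have to be classified as true or pseudo, while the GC position is never considered at all.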

Authors

  • Ruohan Wang
    Department of Computer Science, City University of Hong Kong, Hong Kong 99907, China.
  • Zishuai Wang
    Department of Computer Science, City University of Hong Kong, 83 Tat Chee Ave, Kowloon Tong, Hong Kong, China.
  • Jianping Wang
    Department of Computer Science, City University of Hong Kong, 83 Tat Chee Ave, Kowloon Tong, Hong Kong, China. jianwang@cityu.edu.hk.
  • Shuaicheng Li
    Department of Computer Science, City University of Hong Kong, Kowloon 999077, Hong Kong. shuaicli@gmail.com.