RED-ML: a novel, effective RNA editing detection method based on machine learning.

Journal: GigaScience
Published Date:

Abstract

With the advancement of second-generation sequencing techniques, our ability to detect and quantify RNA editing on a global scale has vastly improved. As a result, RNA editing is now being studied under a growing number of biological conditions so that its biochemical mechanisms and functional roles can be better understood. However, a major barrier that prevents RNA editing from becoming a routine RNA-seq analysis, like gene expression and splicing analysis, is the lack of user-friendly and effective computational tools. Based on years of experience analyzing RNA editing in diverse RNA-seq datasets, we have developed a software tool, RED-ML: RNA Editing Detection based on Machine Learning (pronounced "red ML"). The input to RED-ML can be as simple as a single BAM file, and it can also take advantage of matched genomic variant information when available. The output contains not only the detected RNA editing sites but also a confidence score to facilitate downstream filtering. We have carefully designed validation experiments and performed extensive comparisons and analyses to show the efficiency and effectiveness of RED-ML under different conditions; it can accurately detect novel RNA editing sites without relying on curated RNA editing databases. We have also made this tool freely available via GitHub. In summary, we have developed a highly accurate, fast, and general-purpose tool for RNA editing detection using RNA-seq data. With the availability of RED-ML, it is now possible to conveniently make RNA editing a routine analysis of RNA-seq. We believe this can greatly benefit the RNA editing research community and will have a profound impact in accelerating our understanding of this intriguing post-transcriptional modification process.

Authors

  • Heng Xiong
    BGI-Shenzhen, Shenzhen 518083, China.
  • Dongbing Liu
    BGI-Shenzhen, Shenzhen 518083, China.
  • Qiye Li
    BGI-Shenzhen, Shenzhen 518083, China.
  • Mengyue Lei
    BGI-Shenzhen, Shenzhen 518083, China.
  • Liqin Xu
    BGI-Shenzhen, Shenzhen 518083, China.
  • Liang Wu
    Clinical and Research Center of AIDS, Beijing Ditan Hospital, Capital Medical University, Beijing, China.
  • Zongji Wang
    BGI-Shenzhen, Shenzhen 518083, China.
  • Shancheng Ren
    Department of Urology, Changhai Hospital, Second Military Medical University, Shanghai, China.
  • Wangsheng Li
    BGI-Shenzhen, Shenzhen 518083, China.
  • Min Xia
    BGI-Shenzhen, Shenzhen 518083, China.
  • Lihua Lu
    BGI-Shenzhen, Shenzhen 518083, China.
  • Haorong Lu
    BGI-Shenzhen, Shenzhen 518083, China.
  • Yong Hou
    BGI-Shenzhen, Shenzhen 518083, China.
  • Shida Zhu
    BGI-Shenzhen, Shenzhen 518083, China.
  • Xin Liu
    Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agricultural Sciences, Weifang, Shandong, China.
  • Yinghao Sun
    Department of Urology, Changhai Hospital, Second Military Medical University, Shanghai, China.
  • Jian Wang
    Veterinary Diagnostic Center, Shanghai Animal Disease Control Center, Shanghai, China.
  • Huanming Yang
    BGI-Shenzhen, Shenzhen 518083, China.
  • Kui Wu
    BGI-Shenzhen, Shenzhen 518083, China.
  • Xun Xu
    BGI-Shenzhen, Shenzhen 518083, China.
  • Leo J Lee
    Department of Electrical and Computer Engineering, University of Toronto, Toronto, Ontario M5S 3G4, Canada. Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada. Program on Genetic Networks and Program on Neural Computation & Adaptive Perception, Canadian Institute for Advanced Research, Toronto, Ontario M5G 1Z8, Canada.