PCirc: random forest-based plant circRNA identification software.
Journal:
BMC bioinformatics
Published Date:
Jan 6, 2021
Abstract
BACKGROUND: Circular RNA (circRNA) is a novel type of RNA with a closed-loop structure. Increasing numbers of circRNAs are being identified in plants and animals, and recent studies have shown that circRNAs play an important role in gene regulation. Therefore, identifying circRNAs from increasing amounts of RNA-seq data is very important. However, traditional circRNA recognition methods have limitations. In recent years, emerging machine learning techniques have provided a good approach for the identification of circRNAs in animals. However, using these features to identify plant circRNAs is infeasible because the characteristics of plant circRNA sequences are different from those of animal circRNAs. For example, plants are extremely rich in splicing signals and transposable elements, and their sequence conservation in rice, for example is far less than that in mammals. To solve these problems and better identify circRNAs in plants, it is urgent to develop circRNA recognition software using machine learning based on the characteristics of plant circRNAs.