PlantRNA_Sniffer: A SVM-Based Workflow to Predict Long Intergenic Non-Coding RNAs in Plants.

Journal: Non-coding RNA
Published Date:

Abstract

Non-coding RNAs (ncRNAs) constitute an important set of transcripts produced in the cells of organisms. Among them is a large class of long ncRNAs that are difficult to predict, the so-called long intergenic ncRNAs (lincRNAs), which might play essential roles in gene regulation and other cellular processes. Despite the importance of these lincRNAs, biological knowledge about them is still lacking, and the few computational methods currently available are so specific that they cannot be successfully applied to species other than those for which they were originally designed. Prediction of lncRNAs has been performed with machine learning techniques; for lincRNA prediction in particular, supervised learning methods have been explored in recent literature. As far as we know, there are no methods or workflows specifically designed to predict lincRNAs in plants. In this context, this work proposes a workflow to predict lincRNAs in plants that combines known bioinformatics tools with a machine learning technique, here a support vector machine (SVM). We discuss two case studies that allowed us to identify novel lincRNAs in sugarcane (Saccharum spp.) and in maize (Zea mays). From the results, we could also identify differentially expressed lincRNAs in sugarcane and maize plants exposed to pathogenic and beneficial microorganisms.

Authors

  • Lucas Maciel Vieira
    Departamento de Ciência da Computação, Universidade de Brasília, Brasília-DF 70910-900, Brazil. maciel.lucas@outlook.com.
  • Clicia Grativol
    Laboratório de Química e Função de Proteínas e Peptídeos, Universidade Estadual do Norte Fluminense, Campos dos Goytacazes-RJ 28013-602, Brazil. cgrativol@uenf.br.
  • Flavia Thiebaut
    Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro-RJ 21941-901, Brazil. flaviabqi@gmail.com.
  • Thais G Carvalho
    Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro-RJ 21941-901, Brazil. thaislouise@hotmail.com.
  • Pablo R Hardoim
    Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro-RJ 21941-901, Brazil. phardoim@gmail.com.
  • Adriana Hemerly
    Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro-RJ 21941-901, Brazil. hemerly@bioqmed.ufrj.br.
  • Sergio Lifschitz
    Departamento de Informática, Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro-RJ 22451-900, Brazil. sergio@inf.puc-rio.br.
  • Paulo Cavalcanti Gomes Ferreira
    Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro-RJ 21941-901, Brazil. paulof@bioqmed.ufrj.br.
  • Maria Emilia M T Walter
    Departamento de Ciência da Computação, Universidade de Brasília, Brasília-DF 70910-900, Brazil. mariaemilia@unb.br.