Geminivirus data warehouse: a database enriched with machine learning approaches.

Journal: BMC Bioinformatics
PMID:

Abstract

BACKGROUND: The Geminiviridae family encompasses a group of single-stranded DNA viruses with twinned and quasi-isometric virions, which infect a wide range of dicotyledonous and monocotyledonous plants and are responsible for significant economic losses worldwide. Geminiviruses are divided into nine genera, according to their insect vector, host range, genome organization, and phylogeny reconstruction. Using rolling-circle amplification approaches along with high-throughput sequencing technologies, thousands of full-length geminivirus and satellite genome sequences were amplified and have become available in public databases. As a consequence, many important challenges have emerged, namely, how to classify, store, and analyze massive datasets as well as how to extract information or new knowledge. Data mining approaches, mainly supported by machine learning (ML) techniques, are a natural means for high-throughput data analysis in the context of genomics, transcriptomics, proteomics, and metabolomics.

Authors

  • Jose Cleydson F Silva
    Departamento de Informática, Universidade Federal de Viçosa, Viçosa, Brazil.
  • Thales F M Carvalho
    Departamento de Informática, Universidade Federal de Viçosa, Viçosa, Brazil.
  • Marcos F Basso
    National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil.
  • Michihito Deguchi
    National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil.
  • Welison A Pereira
    National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil.
  • Roberto R Sobrinho
    National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil.
  • Pedro M P Vidigal
    Núcleo de Biomoléculas, Universidade Federal de Viçosa, Viçosa, MG, Brazil.
  • Otávio J B Brustolini
    National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil.
  • Fabyano F Silva
    Departamento de Zootecnia, Universidade Federal de Viçosa, Viçosa, Brazil.
  • Maximiller Dal-Bianco
    National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil.
  • Renildes L F Fontes
    Departamento de Solos, Universidade Federal de Viçosa, Viçosa, Brazil.
  • Anésia A Santos
    National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil.
  • Francisco Murilo Zerbini
    National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil.
  • Fabio R Cerqueira
    Departamento de Informática, Universidade Federal de Viçosa, Viçosa, Brazil.
  • Elizabeth P B Fontes
    National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil. bbfontes@ufv.br.