CamurWeb: a classification software and a large knowledge base for gene expression data of cancer.

Journal: BMC bioinformatics
Published Date:

Abstract

BACKGROUND: The rapid growth of Next Generation Sequencing data demands new knowledge extraction methods. In particular, RNA sequencing, an experimental technique for measuring gene expression, stands out for case-control studies on cancer. Such studies can be addressed with supervised machine learning techniques that extract human-interpretable models composed of genes and their relation to the investigated disease. State-of-the-art rule-based classifiers are designed to extract a single classification model, possibly composed of a few relevant genes. Conversely, we aim to create a large knowledge base composed of many rule-based models, and thus determine which genes could potentially be involved in the analyzed tumor. This comprehensive and open-access knowledge base is required to disseminate novel insights about cancer.

Authors

  • Emanuel Weitschek
    Department of Engineering, Uninettuno International University, Corso Vittorio Emanuele II 39, Rome, 00186, Italy. emanuel@iasi.cnr.it.
  • Silvia Di Lauro
    Institute of Systems Analysis and Computer Science "A. Ruberti", National Research Council, Via dei Taurini 19, Rome, 00185, Italy.
  • Eleonora Cappelli
    Department of Engineering, Roma Tre University, Via della Vasca Navale 79, Rome, 00146, Italy.
  • Paola Bertolazzi
    Institute of Systems Analysis and Computer Science "A. Ruberti", National Research Council, Via dei Taurini 19, Rome, 00185, Italy.
  • Giovanni Felici
    Institute of Systems Analysis and Computer Science "A. Ruberti", National Research Council, Via dei Taurini 19, Rome, 00185, Italy.