Deep learning models for bacteria taxonomic classification of metagenomic data.

Journal: BMC bioinformatics
Published Date:

Abstract

BACKGROUND: An open challenge in translational bioinformatics is the analysis of sequenced metagenomes from various environmental samples. Several studies have demonstrated that 16S ribosomal RNA can serve as a barcode for bacterial classification at the genus level, but it remains difficult to identify the correct composition of metagenomic data from RNA-seq short-read data. 16S short-read data are generated using two next-generation sequencing technologies: whole genome shotgun (WGS) and amplicon (AMP). Typically, the former is filtered to obtain the short reads belonging to a 16S shotgun (SG) dataset, whereas the latter takes into account only some specific 16S hypervariable regions. Because the two technologies, SG and AMP, are used as alternatives to each other, in this work we propose a deep learning approach for taxonomic classification of metagenomic data that can be employed for both of them.

Authors

  • Antonino Fiannaca
    National Research Council of Italy, ICAR-CNR, via Ugo La Malfa 153, Palermo, 90146, Italy. fiannaca@pa.icar.cnr.it.
  • Laura La Paglia
    Istituto di Calcolo e Reti ad Alte Prestazioni-Consiglio Nazionale delle Ricerche, Via Ugo La Malfa 153, 90146 Palermo, Italy.
  • Massimo La Rosa
    Institute of High-Performance Computing and Networking, National Research Council of Italy, Viale delle Scienze, Ed. 11, 90128 Palermo, Italy.
  • Giosue' Lo Bosco
    Dipartimento di Matematica e Informatica, Università degli studi di Palermo, Via Archirafi, 34, Palermo, Italy.
  • Giovanni Renda
    Dipartimento dell'Innovazione Industriale e Digitale, Università degli studi di Palermo, Viale Delle Scienze, ed.6, Palermo, Italy.
  • Riccardo Rizzo
    National Research Council of Italy, ICAR-CNR, via Ugo La Malfa 153, Palermo, 90146, Italy.
  • Salvatore Gaglio
    CNR-ICAR, National Research Council of Italy, Via Ugo La Malfa, 153, Palermo, Italy.
  • Alfonso Urso
    Istituto di Calcolo e Reti ad Alte Prestazioni-Consiglio Nazionale delle Ricerche, Via Ugo La Malfa 153, 90146 Palermo, Italy.