DeePaC: predicting pathogenic potential of novel DNA with reverse-complement neural networks.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: We expect novel pathogens to arise due to their fast-paced evolution, and new species to be discovered thanks to advances in DNA sequencing and metagenomics. Moreover, recent developments in synthetic biology raise concerns that some strains of bacteria could be modified for malicious purposes. Traditional approaches to open-view pathogen detection depend on databases of known organisms, which limits their performance on unknown, unrecognized and unmapped sequences. In contrast, machine learning methods can infer pathogenic phenotypes from single NGS reads, even though the biological context is unavailable.

Authors

  • Jakub M Bartoszewicz
    Bioinformatics Unit (MF1), Department of Methodology and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany.
  • Anja Seidel
    Bioinformatics Unit (MF1), Department of Methodology and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany.
  • Robert Rentzsch
    Research Group Bioinformatics (NG4), Robert Koch Institute, 13353 Berlin, Germany.
  • Bernhard Y Renard
    Research Group Bioinformatics (NG4), Robert Koch Institute, 13353 Berlin, Germany.