DeepDefense: annotation of immune systems in prokaryotes using deep learning.

Journal: GigaScience
PMID:

Abstract

BACKGROUND: Due to a constant evolutionary arms race, archaea and bacteria have evolved an abundant and diverse set of immune responses to protect themselves against phages. Since the discovery and application of CRISPR-Cas adaptive immune systems, numerous novel candidates for immune systems have been identified. Previous approaches to identifying these new immune systems rely either on hidden Markov model (HMM)-based homolog searches or on labor-intensive and costly wet-lab experiments. To aid in finding and classifying immune systems in genomes, we use machine learning to classify already known immune system proteins and to discover potential candidates in the genome. In recent years, neural networks have shown promising results in classifying proteins and predicting their function. However, these methods often operate under the closed-world assumption, which presumes that all possible classes are already known and included in the training dataset. This assumption does not always hold in real-world settings such as genomics, where samples can emerge that were not accounted for during training.
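
The closed-world limitation described above can be illustrated with a minimal sketch. It is an assumption-laden toy example, not the method used by the authors: a classifier's raw scores are converted to probabilities with a softmax, and the prediction is rejected as "unknown" when the top probability falls below a cutoff. The function names, the class labels, and the threshold value of 0.9 are all hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_open_set(logits, class_names, threshold=0.9):
    """Return (label, confidence); the label is 'unknown' when the
    classifier is not confident enough in any known class.

    The threshold is a hypothetical value, not taken from the paper.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "unknown", probs[best]
    return class_names[best], probs[best]


# Hypothetical class labels, for illustration only.
classes = ["CRISPR-Cas", "restriction-modification", "abortive infection"]

# One score dominates, so the known class is accepted.
confident_label, confident_prob = predict_open_set([8.0, 0.5, 0.2], classes)

# Scores are nearly uniform, so the sample is flagged as unknown.
ambiguous_label, ambiguous_prob = predict_open_set([1.0, 0.9, 1.1], classes)
```

A purely closed-world classifier would always return one of the three known labels, even for the second input. The thresholded variant defers instead, although raw softmax confidence is often poorly calibrated, so a fixed cutoff like this should be read as a conceptual illustration only.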

Authors

  • Sven Hauns
    Universität Freiburg, 79098 Freiburg, Germany.
  • Omer S Alkhnbashi
    Chair of Bioinformatics, University of Freiburg, Freiburg, Germany.
  • Rolf Backofen
    Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, Georges-Köhler-Allee 106, Freiburg, 79110, Germany.