Exploring deep learning in phage discovery and characterization.

Journal: Virology
Published Date:

Abstract

Bacteriophages, or bacterial viruses, play diverse ecological roles by shaping bacterial populations and also hold significant biotechnological and medical potential, including the treatment of infections caused by multidrug-resistant bacteria. The discovery of novel bacteriophages using large-scale metagenomic data has been accelerated by the accessibility of deep learning (artificial intelligence), the increased computing power of graphics processing units (GPUs), and new bioinformatics tools. This review addresses the recent revolution in bacteriophage research, ranging from the application of neural network algorithms to metagenomic data to the use of pre-trained language models, such as BERT, which have improved the reconstruction of viral metagenome-assembled genomes (vMAGs). It also discusses how deep learning is used to study the main aspects of bacteriophage biology, highlighting the advances and limitations of this approach. Finally, it outlines prospects for deep-learning-based metagenomic algorithms and offers recommendations for future investigations.

Authors

  • Monyque Karoline de Paula Silva
    Ilum School of Science, Brazilian Center for Research in Energy and Materials (CNPEM), Campinas, São Paulo, Brazil. Electronic address: monyque.karoline@gmail.com.
  • Vitória Yumi Uetuki Nicoleti
    Ilum School of Science, Brazilian Center for Research in Energy and Materials (CNPEM), Campinas, São Paulo, Brazil. Electronic address: vitoriayumiuetuki@gmail.com.
  • Barbara da Paixão Perez Rodrigues
    Ilum School of Science, Brazilian Center for Research in Energy and Materials (CNPEM), Campinas, São Paulo, Brazil. Electronic address: barbaraperezrodrigues@gmail.com.
  • Ademir Sergio Ferreira Araujo
    Federal University of Piauí, Teresina, Piauí, Brazil. Electronic address: asfaruaj@yahoo.com.br.
  • Joel Henrique Ellwanger
    Laboratory of Immunobiology and Immunogenetics, Department of Genetics, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Rio Grande do Sul, Brazil. Electronic address: joel.ellwanger@gmail.com.
  • James Moraes de Almeida
    Ilum School of Science, Brazilian Center for Research in Energy and Materials (CNPEM), Campinas, São Paulo, Brazil. Electronic address: james.almeida@ilum.cnpem.br.
  • Leandro Nascimento Lemos
    Ilum School of Science, Brazilian Center for Research in Energy and Materials (CNPEM), Campinas, São Paulo, Brazil. Electronic address: lemosbioinfo@gmail.com.