1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life.

Journal: Nature Biotechnology
PMID:

Abstract

We present 1,003 reference genomes that were sequenced as part of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) initiative, selected to maximize sequence coverage of phylogenetic space. These genomes double the number of existing type strains and expand their overall phylogenetic diversity by 25%. Comparative analyses with previously available finished and draft genomes reveal a 10.5% increase in novel protein families as a function of phylogenetic diversity. The GEBA genomes recruit 25 million previously unassigned metagenomic proteins from 4,650 samples, improving their phylogenetic and functional interpretation. We identify numerous biosynthetic clusters and experimentally validate a divergent phenazine cluster with potential new chemical structure and antimicrobial activity. This Resource is the largest single release of reference genomes to date. Bacterial and archaeal isolate sequence space is still far from saturated, and future endeavors in this direction will continue to be a valuable resource for scientific discovery.

Authors

  • Supratim Mukherjee
    Department of Energy, Joint Genome Institute, Walnut Creek, California, USA.
  • Rekha Seshadri
    Department of Energy, Joint Genome Institute, Walnut Creek, California, USA.
  • Neha J Varghese
    Department of Energy, Joint Genome Institute, Walnut Creek, California, USA.
  • Emiley A Eloe-Fadrosh
    Department of Energy, Joint Genome Institute, Walnut Creek, California, USA.
  • Jan P Meier-Kolthoff
    Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany.
  • Markus Göker
    Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany.
  • R Cameron Coates
    Department of Energy, Joint Genome Institute, Walnut Creek, California, USA.
  • Michalis Hadjithomas
    Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, California, USA.
  • Georgios A Pavlopoulos
    Department of Energy, Joint Genome Institute, Walnut Creek, California, USA.
  • David Paez-Espino
    Department of Energy, Joint Genome Institute, Walnut Creek, California, USA.
  • Yasuo Yoshikuni
    Department of Energy, Joint Genome Institute, Walnut Creek, California, USA.
  • Axel Visel
    Department of Energy, Joint Genome Institute, Walnut Creek, California, USA.
  • William B Whitman
    Department of Microbiology, University of Georgia, Athens, Georgia, USA.
  • George M Garrity
    Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, Michigan, USA.
  • Jonathan A Eisen
    University of California Davis Genome Center, Davis, California, USA.
  • Philip Hugenholtz
    Australian Centre for Ecogenomics, The University of Queensland, Brisbane, Queensland, Australia.
  • Amrita Pati
    Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, California, USA. apati@lbl.gov, nckyrpides@lbl.gov
  • Natalia N Ivanova
    Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, California, USA.
  • Tanja Woyke
    2800 Mitchell Drive, Walnut Creek, CA 94598, USA.
  • Hans-Peter Klenk
    School of Biology, Newcastle University, Newcastle upon Tyne, UK.
  • Nikos C Kyrpides
    Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, California, USA. apati@lbl.gov, nckyrpides@lbl.gov