Genomic Encyclopedia of Type Strains, Phase I: The one thousand microbial genomes (KMG-I) project

The Genomic Encyclopedia of Bacteria and Archaea (GEBA) project was launched by the JGI in 2007 as a pilot project with the objective of sequencing 250 bacterial and archaeal genomes. The two major goals of that project were (a) to test the hypothesis that there are many benefits to using the phylogenetic diversity of organisms in the tree of life as a primary criterion for choosing which genomes to sequence and (b) to develop the necessary framework, technology and organization for large-scale sequencing of microbial isolate genomes. While the GEBA pilot project has not yet been entirely completed, both of the original goals have already been accomplished, paving the way for the next phase of the project. Here we propose taking the GEBA project to the next level by generating high-quality draft genomes for 1,000 bacterial and archaeal strains. This represents a combined 16-fold increase in both scale and speed as compared to the GEBA pilot project (250 isolate genomes in 4+ years). We will follow an approach to organism selection and sequencing prioritization similar to that of the GEBA pilot project (i.e., phylogenetic novelty, availability and growth of cultures of type strains, and DNA extraction capability), focusing on type strains because this ensures reproducibility of our results and provides the strongest linkage between genome sequences and other knowledge about each strain. In turn, this project will constitute a pilot phase of a larger effort that will target the genome sequences of all available type strains of the Bacteria and Archaea.
