proGenomes: a resource for consistent functional and taxonomic annotations of prokaryotic genomes

The availability of microbial genomes has opened many new avenues of research within microbiology. This has been driven primarily by comparative genomics approaches, which rely on accurate and consistent characterization of genomic sequences. It is nevertheless difficult to obtain consistent taxonomic and integrated functional annotations for defined prokaryotic clades. Thus, we developed proGenomes, a resource that provides user-friendly access to currently 25 038 high-quality genomes whose sequences and consistent annotations can be retrieved individually or by taxonomic clade. These genomes are assigned to 5306 consistent and accurate taxonomic species clusters based on previously established methodology. proGenomes also contains functional information for almost 80 million protein-coding genes, including a comprehensive set of general annotations and more focused annotations for carbohydrate-active enzymes and antibiotic resistance genes. Additionally, broad habitat information is provided for many genomes. All genomes and associated information can be downloaded by user-selected clade or multiple habitat-specific sets of representative genomes. We expect that the availability of high-quality genomes with comprehensive functional annotations will promote advances in clinical microbial genomics, functional evolution and other subfields of microbiology. proGenomes is available at http://progenomes.embl.de.

[1]  Peer Bork,et al.  Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees , 2016, Nucleic Acids Res..

[2]  Peer Bork,et al.  Genome-Wide Experimental Determination of Barriers to Horizontal Gene Transfer , 2007, Science.

[3]  Fangfang Xia,et al.  The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST) , 2013, Nucleic Acids Res..

[4]  Otto X. Cordero,et al.  Ecology drives a global network of gene exchange connecting the human microbiome , 2011, Nature.

[5]  Andrew C. Pawlowski,et al.  The Comprehensive Antibiotic Resistance Database , 2013, Antimicrobial Agents and Chemotherapy.

[6]  Xin Chen,et al.  dbCAN: a web resource for automated carbohydrate-active enzyme annotation , 2012, Nucleic Acids Res..

[7]  Dan M. Bolser,et al.  Ensembl Genomes 2016: more genomes, more complexity , 2015, Nucleic Acids Res..

[8]  Fangfang Xia,et al.  Antimicrobial Resistance Prediction in PATRIC and RAST , 2016, Scientific Reports.

[9]  Pedro M. Coutinho,et al.  The carbohydrate-active enzymes database (CAZy) in 2013 , 2013, Nucleic Acids Res..

[10]  Neil Hall,et al.  Advanced sequencing technologies and their wider impact in microbiology , 2007, Journal of Experimental Biology.

[11]  Tatiana A. Tatusova,et al.  Update on RefSeq microbial genomes resources , 2014, Nucleic Acids Res..

[12]  Julian Parkhill,et al.  Microbiology in the post-genomic era , 2008, Nature Reviews Microbiology.

[13]  Yan Zhang,et al.  PATRIC, the bacterial bioinformatics database and analysis resource , 2013, Nucleic Acids Res..

[14]  Arthur Brady,et al.  MetaRef: a pan-genomic database for comparative and community microbial genomics , 2013, Nucleic Acids Res..

[15]  Luis Pedro Coelho,et al.  Fast Genome-Wide Functional Annotation through Orthology Assignment by eggNOG-Mapper , 2016, bioRxiv.

[16]  Davide Heller,et al.  eggNOG 4.5: a hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences , 2015, Nucleic Acids Res..

[17]  R. Fleischmann,et al.  Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. , 1995, Science.

[18]  Joaquín Dopazo,et al.  PTMcode v2: a resource for functional associations of post-translational modifications within and between proteins , 2014, Nucleic Acids Res..

[19]  B. Snel,et al.  Toward Automatic Reconstruction of a Highly Resolved Tree of Life , 2006, Science.

[20]  Minoru Kanehisa,et al.  KEGG as a reference resource for gene and protein annotation , 2015, Nucleic Acids Res..

[21]  R. Amann,et al.  The species concept for prokaryotes. , 2013, FEMS microbiology reviews.

[22]  R. Fleischmann,et al.  The Minimal Gene Complement of Mycoplasma genitalium , 1995, Science.

[23]  Alison S. Waller,et al.  Genomic variation landscape of the human gut microbiome , 2012, Nature.

[24]  Mark Borodovsky,et al.  Gene identification in prokaryotic genomes, phages, metagenomes, and EST sequences with GeneMarkS suite. , 2011, Current protocols in bioinformatics.

[25]  Alexandros Stamatakis,et al.  Metagenomic species profiling using universal phylogenetic marker genes , 2013, Nature Methods.

[26]  I-Min A. Chen,et al.  IMG 4 version of the integrated microbial genomes comparative analysis system , 2013, Nucleic Acids Res..

[27]  P. Bork,et al.  Accurate and universal delineation of prokaryotic species , 2013, Nature Methods.

[28]  Molly K. Gibson,et al.  Improved annotation of antibiotic resistance determinants reveals microbial resistomes cluster by ecology , 2014, The ISME Journal.

[29]  Luis Pedro Coelho,et al.  Fast Genome-Wide Functional Annotation through Orthology Assignment by eggNOG-Mapper , 2016, bioRxiv.