A review of bioinformatics platforms for comparative genomics. Recent developments of the EDGAR 2.0 platform and its utility for taxonomic and phylogenetic studies.

The rapid development of next-generation sequencing technology has greatly increased the number of available microbial genomes. As a result, there is a rising demand for fast, automated approaches to analyzing these genomes comparatively. Whole-genome sequencing also holds great potential for achieving higher resolution in phylogenetic and taxonomic classification. During the last decade, several software tools and platforms have been developed in the field of comparative genomics. In this manuscript, we review the most commonly used platforms and approaches for ortholog group analyses, with a focus on their potential for phylogenetic and taxonomic research. Furthermore, we describe the latest improvements to the EDGAR platform for comparative genome analyses and present recent examples of its application to the phylogenomic analysis of different taxa. Finally, we illustrate the role of the EDGAR platform as part of the BiGi Center for Microbial Bioinformatics within the German Network for Bioinformatics Infrastructure (de.NBI).
