PHYLOViZ 2.0: providing scalable data integration and visualization for multiple phylogenetic inference methods

High-throughput sequencing provides a cost-effective means of generating high-resolution data for hundreds or even thousands of strains, and is rapidly superseding methodologies based on a few genomic loci. The wealth of genomic data deposited in public databases such as the Sequence Read Archive/European Nucleotide Archive provides a powerful resource for evolutionary analysis and epidemiological surveillance. However, many of the analysis tools currently available do not scale well to these large datasets, nor do they provide the means to fully integrate ancillary data. Here we present PHYLOViZ 2.0, an extension of the PHYLOViZ tool, a platform-independent Java tool that allows phylogenetic inference and data visualization for large datasets of sequence-based typing methods, including Single Nucleotide Polymorphism (SNP) and whole genome/core genome Multilocus Sequence Typing (wg/cgMLST) analysis. PHYLOViZ 2.0 incorporates new data analysis algorithms and new visualization modules, as well as the capability of saving projects for subsequent work or for dissemination of results.

AVAILABILITY AND IMPLEMENTATION: http://www.phyloviz.net/ (licensed under GPLv3).

CONTACT: cvaz@inesc-id.pt

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
