Interactive visual comparison of multiple trees

Traditionally, the visual analysis of hierarchies (trees) focuses on a single given hierarchy. In many research areas, however, multiple differing hierarchies need to be analyzed simultaneously and comparatively, in particular to highlight differences between them, which can be subtle. A prominent example is the analysis of phylogenetic trees in biology, which reflect hierarchical evolutionary relationships among a set of organisms. Typically, the analysis considers multiple phylogenetic trees, either to account for statistical significance or because the evolutionary hierarchies were derived in different ways, for example from different input data such as 16S ribosomal RNA and the protein sequences of highly conserved enzymes. Analyzing a collection of such trees simultaneously yields more insight into the evolutionary process. We introduce a novel visual analytics approach for comparing multiple hierarchies that focuses on both global and local structures. We developed a new tree comparison score for identifying interesting patterns, and a set of linked hierarchy views that show the results of automatic tree comparison at various levels of detail. This combined approach supports detailed assessment of local and global tree similarities. The approach was developed in close cooperation with experts in evolutionary biology. We apply it to a phylogenetic data set on bacterial ancestry to demonstrate its practical benefit.
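The abstract does not define the tree comparison score, so the following is only a minimal sketch of how a local and a global tree similarity could be computed. It is not the authors' score. It assumes trees are given as nested tuples with leaf names as strings, and it swaps in a plain Jaccard overlap of leaf sets as the similarity measure. The names `clades`, `local_scores`, and `global_score` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple, Union

# Assumed representation: a leaf is a string, an inner node is a tuple of children.
Tree = Union[str, Tuple["Tree", ...]]


def clades(tree: Tree) -> List[FrozenSet[str]]:
    """Return the set of leaf names below every inner node, root included."""
    found: List[FrozenSet[str]] = []

    def walk(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.append(leaves)
        return leaves

    walk(tree)
    return found


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard overlap of two leaf sets (stand-in similarity, an assumption)."""
    return len(a & b) / len(a | b)


def local_scores(tree: Tree, other: Tree) -> Dict[FrozenSet[str], float]:
    """Local view: for each inner node of `tree`, its best match in `other`."""
    other_clades = clades(other)
    return {c: max(jaccard(c, o) for o in other_clades) for c in clades(tree)}


def global_score(tree: Tree, other: Tree) -> float:
    """Global view: mean of the local scores over all inner nodes of `tree`."""
    scores = local_scores(tree, other)
    return sum(scores.values()) / len(scores)


# Two small example trees over the same four leaves.
t1: Tree = (("A", "B"), ("C", "D"))
t2: Tree = (("A", "C"), ("B", "D"))

for clade, score in local_scores(t1, t2).items():
    print(sorted(clade), round(score, 3))
print("global:", round(global_score(t1, t2), 3))
```

In this sketch the per-node values play the role of a local comparison that a linked view could map to color, while the averaged value gives one number per tree pair for a global overview. The score is not symmetric: `global_score(t1, t2)` averages over the inner nodes of `t1` only.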

[1]  Bartek Wilczynski,et al.  Biopython: freely available Python tools for computational molecular biology and bioinformatics , 2009, Bioinform..

[2]  Yaw-Ling Lin,et al.  Efficient Algorithms for Descendent Subtrees Comparison of Phylogenetic Trees with Applications to Co-evolutionary Classifications in Bacterial Genome , 2003, ISAAC.

[3]  Kay Hamacher,et al.  Distance‐dependent classification of amino acids by information theory , 2010, Proteins.

[4]  Peer Bork,et al.  Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation , 2007, Bioinform..

[5]  Daniel H. Huson,et al.  Dendroscope: An interactive viewer for large phylogenetic trees , 2007, BMC Bioinformatics.

[6]  Christian J. A. Sigrist,et al.  Nucleic Acids Research Advance Access published November 14, 2007 The 20 years of PROSITE , 2007 .

[7]  O. Gascuel,et al.  Approximate likelihood-ratio test for branches: A fast, accurate, and powerful alternative. , 2006, Systematic biology.

[8]  Martin Graham,et al.  A Survey of Multiple Tree Visualisation , 2010, Inf. Vis..

[9]  L. Holm,et al.  The Pfam protein families database , 2005, Nucleic Acids Res..

[10]  Robert D. Finn,et al.  The Pfam protein families database , 2004, Nucleic Acids Res..

[11]  Tobias Schreck,et al.  Visual analysis of graphs with multiple connected components , 2009, 2009 IEEE Symposium on Visual Analytics Science and Technology.

[12]  Natalia N. Ivanova,et al.  A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea , 2009, Nature.

[13]  James O. McInerney,et al.  TOPD/FMTS: a new software to compare phylogenetic trees , 2007, Bioinform..

[14]  Robert C. Edgar,et al.  MUSCLE: multiple sequence alignment with high accuracy and high throughput. , 2004, Nucleic acids research.

[15]  Wei Qian,et al.  Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. , 2000, Molecular biology and evolution.

[16]  T. M. Nye Trees of trees: an approach to comparing multiple alternative phylogenies. , 2008, Systematic biology.

[17]  Stuart K. Card,et al.  Degree-of-interest trees: a component of an attention-reactive user interface , 2002, AVI '02.

[18]  Ben Shneiderman,et al.  ManyNets: an interface for multiple network analysis and visualization , 2010, CHI.

[19]  C. Meacham,et al.  A general method for tree-comparison based on subtree similarity and its use in a taxonomic database. , 1997, Bio Systems.

[20]  Jean-Daniel Fekete,et al.  Naviguer dans des grands arbres avec ControlTree , 2007, IHM '07.

[21]  Jeffrey Heer,et al.  DOITrees revisited: scalable, space-constrained visualization of hierarchical data , 2004, AVI.

[22]  Jean-Daniel Fekete,et al.  Hierarchical Aggregation for Information Visualization: Overview, Techniques, and Design Guidelines , 2010, IEEE Transactions on Visualization and Computer Graphics.

[23]  Han-Wei Shen,et al.  Balloon Focus: a Seamless Multi-Focus+Context Method for Treemaps , 2008, IEEE Transactions on Visualization and Computer Graphics.

[24]  Jean-Michel Claverie,et al.  Phylogeny.fr: robust phylogenetic analysis for the non-specialist , 2008, Nucleic Acids Res..

[25]  Jarke J. van Wijk,et al.  Visual Comparison of Hierarchically Organized Data , 2008, Comput. Graph. Forum.

[26]  D. Hillis,et al.  Analysis and visualization of tree space. , 2005, Systematic biology.

[27]  Arjan Kuijper,et al.  Visual Analysis of Large Graphs: State‐of‐the‐Art and Future Research Challenges , 2011, Eurographics.

[28]  Xin Chen,et al.  An information-based sequence distance and its application to whole mitochondrial genome phylogeny , 2001, Bioinform..

[29]  M. Steel,et al.  Distributions of Tree Comparison Metrics—Some New Results , 1993 .

[30]  Ben Shneiderman,et al.  Tree visualization with tree-maps: 2-d space-filling approach , 1992, TOGS.

[31]  Alexandru Telea,et al.  Code Flows: Visualizing Structural Evolution of Source Code , 2008, Comput. Graph. Forum.

[32]  Jin Chen,et al.  Constructing Overview + Detail Dendrogram-Matrix Views , 2009, IEEE Transactions on Visualization and Computer Graphics.

[33]  Nicholas Chen,et al.  TreeJuxtaposer : Scalable Tree Comparison using Focus + Context with Guaranteed Visibility , 2006 .

[34]  Han-Wei Shen,et al.  Visualizing Changes of Hierarchical Data using Treemaps , 2007, IEEE Transactions on Visualization and Computer Graphics.

[35]  O. Gascuel,et al.  A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. , 2003, Systematic biology.

[36]  Kay Hamacher,et al.  Protein Domain Phylogenies - Information Theory and Evolutionary Dynamics , 2010, BIOINFORMATICS.