Improving Visualization of Large Hierarchical Clustering

The classical representation of the binary tree produced by hierarchical clustering is a node-link visualization known as a dendrogram. It lets users explore the clusters and the relationships between instances in a simple way. However, large dendrograms are known to be difficult to explore because of the graphical and cognitive information overload involved. Here, we discuss current approaches and introduce Stacked Trees, a new Focus+Context visualization technique that supports the exploration of hierarchical clusterings of up to fifty thousand nodes on a standard-sized screen.
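To make the binary-tree structure behind a dendrogram concrete, the following is a minimal sketch in plain Python. It is not the Stacked Trees technique; it only assumes a naive single-linkage agglomerative clustering over one-dimensional points, and the function names (`single_linkage`, `count_nodes`, `render`) and the sample data are hypothetical, chosen for illustration. It shows that clustering n instances yields a binary tree of 2n - 1 nodes, which is why a node-link drawing grows quickly with the data.

```python
from itertools import combinations


def single_linkage(points):
    """Naive single-linkage clustering of 1-D points (illustrative assumption).

    Returns a binary tree: a leaf is a point index (int), an internal node
    is a tuple (left_subtree, right_subtree, merge_distance).
    """
    # Each active cluster is stored as (subtree, list_of_member_indices).
    clusters = [(i, [i]) for i in range(len(points))]
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for a, b in combinations(range(len(clusters)), 2):
            d = min(abs(points[i] - points[j])
                    for i in clusters[a][1] for j in clusters[b][1])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merged = ((clusters[a][0], clusters[b][0], d),
                  clusters[a][1] + clusters[b][1])
        # Pop the larger index first so the smaller one stays valid.
        clusters.pop(b)
        clusters.pop(a)
        clusters.append(merged)
    return clusters[0][0]


def count_nodes(tree):
    """Count leaves and internal nodes of the binary tree."""
    if isinstance(tree, int):
        return 1
    left, right, _ = tree
    return 1 + count_nodes(left) + count_nodes(right)


def render(tree, depth=0):
    """Return one indented text line per node, as a crude textual dendrogram."""
    if isinstance(tree, int):
        return ["  " * depth + f"leaf {tree}"]
    left, right, d = tree
    return (["  " * depth + f"merge at distance {d:.2f}"]
            + render(left, depth + 1) + render(right, depth + 1))


# Hypothetical sample data: five instances on a line.
points = [0.0, 0.1, 1.0, 1.2, 5.0]
tree = single_linkage(points)
print("\n".join(render(tree)))
print("nodes:", count_nodes(tree))
```

With five instances the tree has nine nodes; the same relationship means a tree of about fifty thousand nodes corresponds to roughly twenty-five thousand clustered instances, far more lines than an indented listing or a plain dendrogram can show legibly on one screen.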
