Hierarchy-based projection of high-dimensional labeled data to reduce visual clutter

Abstract: Visualizing high-dimensional labeled data on a two-dimensional plane can quickly result in visual clutter and information overload. To address this problem, the data usually needs to be structured so that only parts of it are displayed at a time. We present a hierarchy-based approach that projects labeled data at different levels of detail onto a two-dimensional plane while keeping the user's cognitive load during level changes as low as possible. The approach consists of three steps: first, the data is hierarchically clustered; second, the user determines the levels of detail; third, the levels of detail are visualized one at a time on a two-dimensional plane. Animations make transitions between the levels of detail traceable, and exploration on each level is supported by several interaction techniques, including halos, a darts view, and a magic lens. We demonstrate the applicability and usefulness of the approach with use cases from the patent domain and a question-and-answer website. In addition, we conducted a qualitative evaluation to assess the usefulness and comprehensibility of our approach.
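The three steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which clustering or projection technique is used, so centroid-linkage agglomerative clustering and PCA are assumptions made here for illustration, and the names agglomerate, project_2d, and level_view are hypothetical. The user's choice of a level of detail is modeled as picking the number of clusters k.

```python
import numpy as np

def agglomerate(points):
    """Naive centroid-linkage clustering (an assumed stand-in technique).

    Returns a dict mapping k to the partition with k clusters,
    where each cluster is a sorted list of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    levels = {len(clusters): [list(c) for c in clusters]}
    while len(clusters) > 1:
        centroids = np.array([points[c].mean(axis=0) for c in clusters])
        # Pairwise centroid distances; the diagonal is masked out.
        dist = np.linalg.norm(centroids[:, None] - centroids[None, :], axis=-1)
        np.fill_diagonal(dist, np.inf)
        a, b = np.unravel_index(np.argmin(dist), dist.shape)
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
        levels[len(clusters)] = [list(c) for c in clusters]
    return levels

def project_2d(points):
    """PCA via SVD: project onto the two directions of largest variance."""
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

def level_view(points, labels, levels, k):
    """One level of detail: a 2D position and the member labels per cluster."""
    coords = project_2d(points)
    return [(coords[c].mean(axis=0), [labels[i] for i in c]) for c in levels[k]]

# Toy data (made up for illustration): two well-separated groups in 4D.
points = np.array([
    [0.0, 0.0, 0.0, 0.0], [0.1, 0.0, 0.0, 0.0], [0.0, 0.1, 0.0, 0.0],
    [5.0, 5.0, 5.0, 5.0], [5.1, 5.0, 5.0, 5.0], [5.0, 5.1, 5.0, 5.0],
])
labels = ["a1", "a2", "a3", "b1", "b2", "b3"]

levels = agglomerate(points)                       # step 1: hierarchy
coarse = level_view(points, labels, levels, k=2)   # steps 2 and 3: pick a level, project it
fine = level_view(points, labels, levels, k=6)     # the most detailed level
```

Because every level reuses the same projection, a cluster's position at a coarse level is the mean of its members' positions at finer levels, which is one simple way to keep level changes easy to follow. The animations and interaction techniques mentioned in the abstract are outside the scope of this sketch.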
