A Space-Filling Multidimensional Visualization (SFMDVis) for Exploratory Data Analysis

Abstract The space-filling visualization model was introduced by Ben Shneiderman [28] to maximize the utilization of display space in relational data (graph) visualization, especially tree visualization. It relies on the concept of enclosure, which dispenses with the “edges” that traditional node-link graph visualizations depend on. As a result, edge crossing, a major issue in graph visualization, is naturally avoided by adopting a space-filling approach. To date, however, the space-filling concept has attracted little attention from researchers in multidimensional visualization. Edge crossing also occurs among the polylines that serve as the basic visual elements of parallel coordinates, and when those crossings are not evenly distributed over the display plane they produce visual clutter. This clutter can significantly reduce readability when a viewer inspects a particular region of the visualization. In this study, we propose a new Space-Filling Multidimensional Data Visualization (SFMDVis) that, for the first time, introduces a space-filling approach into multidimensional data visualization. The main contributions are: (1) maximizing space utilization in multidimensional visualization (i.e., 100% of the display area is used), (2) eliminating visual clutter in SFMDVis through the use of a non-classic geometric primitive, and (3) improving the quality of visualization for perceiving linear correlations among variables and for recognizing data patterns. To evaluate SFMDVis, we conducted a usability study comparing its performance with that of parallel coordinates and a scatterplot matrix for finding linear correlations and data patterns. The evaluation results suggest that SFMDVis is more accurate than both techniques for perceiving linear correlations and more efficient (less time is required) than both for recognizing data patterns.

[1] Ben Shneiderman, et al. The eyes have it: a task by data type taxonomy for information visualizations, 1996, Proceedings 1996 IEEE Symposium on Visual Languages.

[2] F. Wilcoxon. Individual Comparisons by Ranking Methods, 1945.

[3] Ye Zhao, et al. Tile-based parallel coordinates and its application in financial visualization, 2010, Electronic Imaging.

[4] Jonathan Goldstein, et al. When Is "Nearest Neighbor" Meaningful?, 1999, ICDT.

[5] E. Wegman. Hyperdimensional Data Analysis Using Parallel Coordinates, 1990.

[6] John T. Behrens, et al. Principles and procedures of exploratory data analysis, 1997.

[7] Ben Shneiderman, et al. Tree visualization with tree-maps: 2-d space-filling approach, 1992, TOGS.

[8] Kang Zhang, et al. An Interactive Scatter Plot Metrics Visualization for Decision Trend Analysis, 2012, 2012 11th International Conference on Machine Learning and Applications.

[9] H. Sagan. Space-filling curves, 1994.

[10] Eser Kandogan. Star Coordinates: A Multi-dimensional Visualization Technique with Uniform Treatment of Dimensions, 2000.

[11] Hong Zhou, et al. Visual Clustering in Parallel Coordinates, 2008, Comput. Graph. Forum.

[12] R. Bellman. Dynamic programming, 1957, Science.

[13] Matthew O. Ward, et al. Visual Hierarchical Dimension Reduction for Exploration of High Dimensional Datasets, 2003, VisSym.

[14] D. Botstein, et al. Cluster analysis and display of genome-wide expression patterns, 1998, Proceedings of the National Academy of Sciences of the United States of America.

[15] Xiaoru Yuan, et al. Interactive local clustering operations for high dimensional data in parallel coordinates, 2010, 2010 IEEE Pacific Visualization Symposium (PacificVis).

[16] R. Bakeman. Recommended effect size statistics for repeated measures designs, 2005, Behavior research methods.

[17] Daniel J. Denis, et al. The early origins and development of the scatterplot, 2005, Journal of the history of the behavioral sciences.

[18] I. Ntzoufras. Gibbs Variable Selection using BUGS, 2002.

[19] J. A. Hartigan, et al. A k-means clustering algorithm, 1979.

[20] Daniel A. Keim, et al. Pixel bar charts: a visualization technique for very large multi-attribute data sets?, 2002, Inf. Vis.

[21] Jesse S. Jin, et al. Parallel Rough Set: Dimensionality Reduction and Feature Discovery of Multi-Dimensional Data in Visualization, 2011, ICONIP.

[22] Kazuho Watanabe, et al. Spectral-Based Contractible Parallel Coordinates, 2014, 2014 18th International Conference on Information Visualisation.

[23] Almir Olivette Artero, et al. Uncovering Clusters in Crowded Parallel Coordinates Visualizations, 2004.

[24] Christopher G. Healey, et al. Choosing effective colours for data visualization, 1996, Proceedings of Seventh Annual IEEE Visualization '96.

[25] Stefan Berchtold, et al. Similarity clustering of dimensions for an enhanced visualization of multidimensional data, 1998, Proceedings IEEE Symposium on Information Visualization.

[26] Alfred Inselberg, et al. The plane with parallel coordinates, 1985, The Visual Computer.

[27] Ramana Rao, et al. The table lens: merging graphical and symbolic representations in an interactive focus + context visualization for tabular information, 1994, CHI '94.

[28] J. Ross Quinlan, et al. C4.5: Programs for Machine Learning, 1992.

[29] Martin Theus, et al. Interactive Data Visualization using Mondrian, 2002.

[30] Hans-Peter Kriegel, et al. 'Circle Segments': A Technique for Visually Exploring Large Multidimensional Data Sets, 1996.

[31] Cynthia A. Brewer, et al. ColorBrewer.org: An Online Tool for Selecting Colour Schemes for Maps, 2003.

[32] Andreas Buja, et al. Exploratory Visual Analysis of Graphs in GGOBI, 2004.

[33] Helwig Hauser, et al. Angular brushing of extended parallel coordinates, 2002, IEEE Symposium on Information Visualization, 2002. INFOVIS 2002.

[34] M. Friendly. Corrgrams, 2002.

[35] Matthew O. Ward, et al. Clutter Reduction in Multi-Dimensional Data Visualization Using Dimension Reordering, 2004.