Hierarchical parallel coordinates for exploration of large datasets

Our ability to accumulate large, complex (multivariate) data sets has far exceeded our ability to effectively process them in searching for patterns, anomalies, and other interesting features. Conventional multivariate visualization techniques generally do not scale well with the size of the data set. This paper focuses on the interactive visualization of large multivariate data sets, based on a number of novel extensions to the parallel coordinates display technique. We develop a multi-resolution view of the data via hierarchical clustering, and use a variation of parallel coordinates to convey aggregation information for the resulting clusters. Using our suite of navigational and filtering tools, users can then navigate the resulting structure until the desired focus region and level of detail are reached. We describe the design and implementation of our hierarchical parallel coordinates system, which extends the XmdvTool system. Lastly, we show examples of the tools and techniques applied to large multivariate data sets (hundreds of thousands of records).
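The multi-resolution idea can be illustrated with a minimal sketch. It is not the system's implementation: the abstract does not name the clustering algorithm or the exact aggregates, so this example assumes a naive centroid-based agglomerative clustering and assumes that each cluster is summarized by its record count, per-dimension mean, and per-dimension minimum and maximum (enough to draw a mean polyline with a band around it). All names (`Cluster`, `build_hierarchy`, `clusters_at_detail`, `max_extent`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class Cluster:
    """One node of the cluster tree, with per-dimension aggregates."""
    count: int
    mean: List[float]
    lo: List[float]   # per-dimension minimum over the cluster's records
    hi: List[float]   # per-dimension maximum over the cluster's records
    children: List["Cluster"] = field(default_factory=list)

    @property
    def extent(self) -> float:
        # Largest per-dimension range; used here as a simple size measure.
        return max(h - l for l, h in zip(self.lo, self.hi))


def merge(a: Cluster, b: Cluster) -> Cluster:
    """Combine two clusters, deriving the parent's aggregates from the children."""
    n = a.count + b.count
    mean = [(a.count * x + b.count * y) / n for x, y in zip(a.mean, b.mean)]
    lo = [min(x, y) for x, y in zip(a.lo, b.lo)]
    hi = [max(x, y) for x, y in zip(a.hi, b.hi)]
    return Cluster(n, mean, lo, hi, [a, b])


def sq_dist(u: Sequence[float], v: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(u, v))


def build_hierarchy(records: Sequence[Sequence[float]]) -> Cluster:
    """Naive agglomerative clustering: repeatedly merge the two closest centroids.

    Cubic in the number of records, so suitable only for illustration;
    a large data set would need a scalable clustering method instead.
    """
    if not records:
        raise ValueError("records must not be empty")
    nodes = [Cluster(1, list(r), list(r), list(r)) for r in records]
    while len(nodes) > 1:
        best = None
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                d = sq_dist(nodes[i].mean, nodes[j].mean)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = merge(nodes[i], nodes[j])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]


def clusters_at_detail(root: Cluster, max_extent: float) -> List[Cluster]:
    """Walk down the tree until every returned cluster is small enough.

    A smaller max_extent yields more, finer clusters (higher level of detail).
    """
    selected, stack = [], [root]
    while stack:
        node = stack.pop()
        if node.extent <= max_extent or not node.children:
            selected.append(node)
        else:
            stack.extend(node.children)
    return selected


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (11.0, 10.0)]
    tree = build_hierarchy(data)
    for c in clusters_at_detail(tree, max_extent=2.0):
        print(c.count, c.mean, c.lo, c.hi)
```

Each selected cluster would be drawn as a single polyline through its `mean` values, with `lo` and `hi` bounding a band on each axis. Lowering `max_extent` drills down to finer clusters, which corresponds to the level-of-detail navigation described above.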
