Parallel Sets: interactive exploration and visual analysis of categorical data

Categorical data dimensions appear in many real-world data sets, but few visualization methods handle them properly. Parallel Sets are a new method for the visualization and interactive exploration of categorical data that shows data frequencies instead of the individual data points. The method is based on the axis layout of parallel coordinates, with boxes representing the categories and parallelograms between the axes showing the relations between categories. In addition to the visual representation, we designed a rich set of interactions. Parallel Sets allow the user to interactively remap the data to new categorizations and, thus, to consider more data dimensions during exploration and analysis than is usually possible. At the same time, a meta-level, semantic representation of the data is built. Common procedures, like building the cross product of two or more dimensions, can be performed automatically, thus complementing the interactive visualization. We demonstrate Parallel Sets by analyzing a large CRM data set, as well as investigating housing data from two US states.
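
To make the frequency-based idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes records are plain dictionaries; the function names (`category_frequencies`, `ribbon_frequencies`, `cross_product_dimension`) and the tiny housing-style sample are hypothetical and chosen only for illustration. It computes the relative frequencies that would size the category boxes on one axis, the joint frequencies that would size the parallelograms between two axes, and a combined dimension formed as the cross product of two dimensions.

```python
from collections import Counter


def category_frequencies(records, dimension):
    """Relative frequency of each category on one axis (box sizes)."""
    counts = Counter(r[dimension] for r in records)
    total = len(records)
    return {category: n / total for category, n in counts.items()}


def ribbon_frequencies(records, dim_a, dim_b):
    """Joint relative frequency of each category pair on two adjacent
    axes (sizes of the parallelograms connecting the boxes)."""
    counts = Counter((r[dim_a], r[dim_b]) for r in records)
    total = len(records)
    return {pair: n / total for pair, n in counts.items()}


def cross_product_dimension(records, dims, name):
    """Return copies of the records with a new dimension `name` whose
    categories are the combinations of the categories in `dims`."""
    return [
        {**r, name: " / ".join(str(r[d]) for d in dims)}
        for r in records
    ]


# Hypothetical sample data, invented for this sketch.
records = [
    {"tenure": "own", "region": "north"},
    {"tenure": "own", "region": "south"},
    {"tenure": "rent", "region": "north"},
    {"tenure": "own", "region": "north"},
]

boxes = category_frequencies(records, "tenure")
ribbons = ribbon_frequencies(records, "tenure", "region")
combined = cross_product_dimension(records, ["tenure", "region"], "tenure_x_region")

print(boxes)
print(ribbons)
print(category_frequencies(combined, "tenure_x_region"))
```

Because only aggregated frequencies are drawn, the amount of geometry in such a display depends on the number of categories and category combinations rather than on the number of records.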
