A Task Based Performance Evaluation of Visualization Approaches for Categorical Data Analysis

Categorical data is common in many areas, and efficient methods for analysing it are needed. It is, however, often difficult to analyse categorical data since no general measure of similarity exists. One approach is to represent the categories with numerical values (quantification) prior to visualization using methods for numerical data. Another is to use visual representations specifically designed for categorical data. Although both approaches are commonly used, very little guidance is available as to which may be most useful for different analysis tasks. This paper presents an evaluation comparing the performance of two approaches: quantification prior to visualization, and visualization using a method designed for categorical data. It also provides guidance as to which visualization approach is most useful in the context of two basic data analysis tasks: one related to similarity structures and one related to category frequency. The results strongly indicate that the quantification approach is most efficient for the similarity-related task, whereas the visual representation designed for categorical data is most efficient for the task related to category frequency.
