Measuring Data Abstraction Quality in Multiresolution Visualizations

Data abstraction techniques are widely used in multiresolution visualization systems to reduce visual clutter and to support analysis from overview to detail. However, analysts are usually unaware of how well the abstracted data represent the original dataset, which can undermine the reliability of conclusions drawn from the abstractions. In this paper, we define two data abstraction quality measures that compute the degree to which an abstraction conveys the original dataset: the histogram difference measure and the nearest neighbor measure. Both have been integrated into XmdvTool, a public-domain multiresolution visualization system for multivariate data analysis that supports both sampling and clustering to simplify data. Several interactive operations are provided, including adjusting the data abstraction level, changing selected regions, and setting the acceptable data abstraction quality level. Using these operations, analysts can select an optimal data abstraction level. Analysts can also use the measures to compare abstraction methods, see how well each maintains relative data density and outliers, and then select a method that meets the requirements of their analytic tasks.
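The abstract names the two measures but does not give their formulas, so the following is only a rough sketch of how such measures might be computed, not the definitions used in the paper or in XmdvTool. It assumes that the histogram difference measure can be approximated by per-dimension histogram intersection between the original and abstracted data, and that the nearest neighbor measure can be approximated by the mean distance from each original record to its closest abstracted record, scaled to [0, 1]. The function names, the `bins` parameter, and the normalization are all hypothetical.

```python
import math
import random


def _normalized_histogram(values, lo, hi, bins):
    """Histogram of `values` over [lo, hi], with bin heights summing to 1."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # guard against a constant dimension
    for v in values:
        idx = min(max(int((v - lo) / width), 0), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def histogram_difference_quality(original, abstracted, bins=10):
    """Assumed form: mean per-dimension histogram intersection.

    Returns a value in [0, 1]; 1 means the abstraction has the same
    relative density as the original in every dimension.
    """
    dims = len(original[0])
    total = 0.0
    for d in range(dims):
        lo = min(row[d] for row in original)
        hi = max(row[d] for row in original)
        h_orig = _normalized_histogram([r[d] for r in original], lo, hi, bins)
        h_abs = _normalized_histogram([r[d] for r in abstracted], lo, hi, bins)
        total += sum(min(a, b) for a, b in zip(h_orig, h_abs))
    return total / dims


def nearest_neighbor_quality(original, abstracted):
    """Assumed form: 1 minus the normalized mean nearest-neighbor distance.

    Each dimension is scaled to [0, 1] using the original data range, so
    an original record far from every abstracted record (for example a
    dropped outlier) lowers the score.
    """
    dims = len(original[0])
    lows = [min(r[d] for r in original) for d in range(dims)]
    spans = [(max(r[d] for r in original) - lows[d]) or 1.0 for d in range(dims)]

    def scale(row):
        return [(row[d] - lows[d]) / spans[d] for d in range(dims)]

    scaled_abs = [scale(r) for r in abstracted]
    max_dist = math.sqrt(dims)  # diagonal of the unit hypercube
    total = sum(min(math.dist(scale(r), q) for q in scaled_abs) for r in original)
    return 1.0 - total / (len(original) * max_dist)


if __name__ == "__main__":
    random.seed(0)
    data = [[random.gauss(0, 1), random.gauss(5, 2)] for _ in range(500)]
    sample = random.sample(data, 50)  # abstraction by uniform sampling
    print("histogram quality:", histogram_difference_quality(data, sample))
    print("nearest neighbor quality:", nearest_neighbor_quality(data, sample))
```

Under these assumptions, an analyst-facing tool could recompute both scores as the sampling rate or cluster level changes and flag abstraction levels that fall below a chosen quality threshold.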
