Brushing Dimensions - A Dual Visual Analysis Model for High-Dimensional Data

In many application fields, data analysts have to deal with datasets that contain many expressions per item. Effective analysis of such multivariate datasets depends on the user's ability to understand both the intrinsic dimensionality of the dataset and the distribution of the dependent values with respect to the dimensions. In this paper, we propose a visualization model that enables the joint interactive visual analysis of multivariate datasets with respect to both their dimensions and the actual data values. We describe a dual setting of visualization and interaction in items space and in dimensions space. The visualization of items is linked to the visualization of dimensions through brushing and focus+context visualization. With this approach, the user can jointly study the structure of the dimensions space and the distribution of data items with respect to the dimensions. Although the proposed visualization model is general, we demonstrate its application in the context of DNA microarray data analysis.
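The dual setting can be illustrated with a minimal sketch. It is not the implementation described in the paper. It assumes that each dimension is represented in dimensions space by summary statistics (here, mean and standard deviation) computed over the items, and that a brush is a simple threshold or range test. All function names and the toy data are hypothetical.

```python
from statistics import mean, pstdev

def dimension_stats(items, item_mask=None):
    """Map each dimension (column) to a (mean, std) point in dimensions space.

    Only items flagged True in item_mask contribute (assumption: a brush
    in items space restricts the statistics to the brushed items).
    """
    if item_mask is None:
        item_mask = [True] * len(items)
    selected = [row for row, keep in zip(items, item_mask) if keep]
    columns = list(zip(*selected))  # transpose: items space -> dimensions space
    return [(mean(col), pstdev(col)) for col in columns]

def brush_dimensions(stats, min_std):
    """Brush in dimensions space: flag dimensions whose spread is at least min_std."""
    return [std >= min_std for _, std in stats]

def project_items(items, dim_mask):
    """Focus view in items space: keep only the brushed dimensions of each item."""
    return [[v for v, keep in zip(row, dim_mask) if keep] for row in items]

def brush_items(items, dim_index, low, high):
    """Brush in items space: flag items whose value in one dimension lies in [low, high]."""
    return [low <= row[dim_index] <= high for row in items]

# Toy data (hypothetical): 4 items, 3 dimensions; dimension 0 is constant.
items = [
    [1.0, 5.0, 2.0],
    [1.0, 7.0, 4.0],
    [1.0, 9.0, 6.0],
    [1.0, 11.0, 8.0],
]

# Dimensions -> items: brush the dimensions that vary, then view items in focus.
stats = dimension_stats(items)
dim_mask = brush_dimensions(stats, min_std=1.0)
focus = project_items(items, dim_mask)

# Items -> dimensions: brush items, then recompute the dimension statistics.
item_mask = brush_items(items, dim_index=1, low=8.0, high=12.0)
linked_stats = dimension_stats(items, item_mask)
```

In this sketch the link runs in both directions: a brush on dimensions changes which columns the items view shows, and a brush on items changes where each dimension lands in the statistics view.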
