Visualnostics: Visual Guidance Pictograms for Analyzing Projections of High‐dimensional Data

The visual analysis of multivariate projections is a challenging task because complex visual structures occur. These structures cause fatigue or misinterpretation, which distorts the analysis; in fact, the same projection can lead different analysts to different results. We provide visual guidance pictograms to improve the objectivity of the visual search. A visual guidance pictogram is an iconic visual density map that encodes the visual structure of certain data properties. Using pictograms to guide the analysis helps the user understand structures in the projection and mentally link them to properties in the data. We introduce a systematic scheme for designing such pictograms and provide a set of pictograms for standard visual tasks, such as correlation and distribution analysis, and for standard projections such as scatterplots, RadViz, and Star Coordinates. We conduct a study that compares the visual analysis of real data with and without the support of guidance pictograms. Our tests show that supporting the user's visual search with guidance pictograms decreases the training effort for a visual search and reduces the analysis bias.
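To make the idea of a pictogram as an "iconic visual density map" concrete, the following is a minimal sketch for the scatterplot case only. It is not the authors' design scheme: the function names `correlated_sample` and `density_pictogram`, the 16x16 grid, the value range, and the use of synthetic Gaussian data with a chosen correlation are all assumptions made for illustration.

```python
import math
import random


def correlated_sample(rho, n=5000, seed=0):
    """Draw n synthetic 2D points with correlation rho (illustrative data only)."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        # Mixing two independent normals yields a pair with correlation rho.
        points.append((u, rho * u + math.sqrt(1.0 - rho * rho) * v))
    return points


def density_pictogram(points, bins=16, extent=3.0):
    """Bin points into a small bins x bins grid and scale so the peak cell is 1.

    Row 0 holds the lowest y values and column 0 the lowest x values.
    Grid size and extent are hypothetical choices, not taken from the paper.
    """
    grid = [[0] * bins for _ in range(bins)]
    width = 2.0 * extent / bins
    for x, y in points:
        if -extent <= x < extent and -extent <= y < extent:
            col = min(bins - 1, int((x + extent) / width))
            row = min(bins - 1, int((y + extent) / width))
            grid[row][col] += 1
    peak = max(max(row) for row in grid)
    if peak == 0:
        return [[0.0] * bins for _ in range(bins)]
    return [[count / peak for count in row] for row in grid]


# A small density icon for "strong positive correlation in a scatterplot".
pictogram = density_pictogram(correlated_sample(0.9))
```

Under these assumptions, the resulting grid concentrates its density along the main diagonal, which is the kind of visual structure a user could compare against a real scatterplot when searching for correlated dimensions.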
