Using multiple attribute-based explanations of multidimensional projections to explore high-dimensional data

Abstract: Multidimensional projections (MPs) are effective methods for visualizing high-dimensional datasets to find structures in the data, such as groups of similar points and outliers. The insights obtained from MPs can be amplified by complementing these techniques with several so-called explanatory mechanisms. We present and discuss a set of six such mechanisms that explain MPs in terms of similar dimensions, local dimensionality, and dimension correlations. We implement our explanatory tools using an image-based approach, which is efficient to compute, scales well visually for large and dense MP scatterplots, and can handle any projection technique. For several high-dimensional datasets, we demonstrate how the provided explanatory views can be combined to augment each other’s value and thereby lead to refined insights into the data, and how these insights correlate with known facts about the data under study.
