Automated Analytical Methods to Support Visual Exploration of High-Dimensional Data

Visual exploration of multivariate data typically requires projection onto lower-dimensional representations. The number of possible representations grows rapidly with the number of dimensions, and manual exploration quickly becomes ineffective or even infeasible. This paper proposes automatic analysis methods to extract potentially relevant visual structures from a set of candidate visualizations. The visualizations are ranked by features computed from them, in accordance with a user-specified task. The user is provided with a manageable number of potentially useful candidate visualizations, which can be used as a starting point for interactive data analysis. This can ease the search for truly useful visualizations and potentially speed up data exploration. In this paper, we present ranking measures for class-based as well as non-class-based scatterplot and parallel coordinates visualizations. The proposed analysis methods are evaluated on several data sets.
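The abstract does not spell out the ranking measures themselves, so the following is only a minimal sketch of the general idea for class-based scatterplots: enumerate every axis-aligned pair of dimensions, score each candidate plot, and keep the best few. The score used here (the fraction of points that lie closest to their own class centroid) is a simple stand-in chosen for illustration, not the measure proposed in the paper, and the names `centroid_consistency` and `rank_scatterplots` are hypothetical.

```python
from itertools import combinations

import numpy as np


def centroid_consistency(points, labels):
    """Fraction of 2D points whose nearest class centroid is their own class.

    Illustrative stand-in score, not the paper's ranking measure.
    """
    classes = np.unique(labels)
    centroids = np.array([points[labels == c].mean(axis=0) for c in classes])
    # Pairwise distances, shape (n_points, n_classes).
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    nearest = classes[np.argmin(dists, axis=1)]
    return float(np.mean(nearest == labels))


def rank_scatterplots(X, labels, top_k=3):
    """Score every pair of dimensions and return the top_k as ((i, j), score)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Rescale each dimension to [0, 1] so no axis dominates the distances.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xn = (X - X.min(axis=0)) / span
    scored = [
        ((i, j), centroid_consistency(Xn[:, [i, j]], labels))
        for i, j in combinations(range(X.shape[1]), 2)
    ]
    # Highest score first; ties keep their enumeration order.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy data: dimensions 0 and 1 are unrelated to the class, 2 and 3 encode it.
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
X = np.column_stack([
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
    labels,
    labels,
])
for pair, score in rank_scatterplots(X, labels, top_k=6):
    print(pair, score)
```

In this toy data set every plot that includes dimension 2 or 3 separates the classes, while the plot of dimensions 0 and 1 does not, so it is ranked last. A real system would swap in a task-specific measure and would typically need a cheaper search strategy than exhaustive enumeration as the number of dimensions grows.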
