Combining automated analysis and visualization techniques for effective exploration of high-dimensional data

Visual exploration of multivariate data typically requires projection onto lower-dimensional representations. The number of possible representations grows rapidly with the number of dimensions, and manual exploration quickly becomes ineffective or even infeasible. This paper proposes automatic analysis methods to extract potentially relevant visual structures from a set of candidate visualizations. The visualizations are then ranked, based on their features, in accordance with a specified user task. The user is provided with a manageable number of potentially useful candidate visualizations, which can be used as a starting point for interactive data analysis. This eases the task of finding useful visualizations and can speed up data exploration. In this paper, we present ranking measures for class-based as well as non-class-based scatterplot and parallel coordinates visualizations. The proposed analysis methods are evaluated on different datasets.
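
To make the ranking idea concrete, the following is a minimal sketch of how candidate scatterplots of class-labeled data might be scored and ranked. It is an illustration under stated assumptions, not one of the ranking measures proposed in the paper: the function names `separation_score` and `rank_scatterplots`, the parameter `top_k`, and the score itself (between-class centroid spread divided by within-class spread) are hypothetical, and the dimensions are assumed to be normalized to comparable ranges.

```python
from itertools import combinations
from statistics import mean


def separation_score(points, labels):
    """Hypothetical class-separation score for one 2D scatterplot.

    Ratio of between-class centroid spread to within-class spread.
    This is an illustrative stand-in, not a measure from the paper.
    """
    # Group the 2D points by class label.
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)

    # Centroid of each class.
    centroids = {
        label: (mean(x for x, _ in pts), mean(y for _, y in pts))
        for label, pts in groups.items()
    }

    # Mean squared distance of each point to its own class centroid.
    within = mean(
        (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2
        for (x, y), c in zip(points, labels)
    )

    # Mean squared distance of each point's class centroid to the global mean.
    gx = mean(x for x, _ in points)
    gy = mean(y for _, y in points)
    between = mean(
        (centroids[c][0] - gx) ** 2 + (centroids[c][1] - gy) ** 2
        for c in labels
    )

    # Small constant avoids division by zero for degenerate plots.
    return between / (within + 1e-12)


def rank_scatterplots(rows, labels, top_k=3):
    """Score every pair of dimensions and return the top_k best-ranked pairs."""
    n_dims = len(rows[0])
    scored = []
    for i, j in combinations(range(n_dims), 2):
        points = [(row[i], row[j]) for row in rows]
        scored.append(((i, j), separation_score(points, labels)))
    # Highest score first: these are the candidates shown to the user.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]
```

Enumerating every pair of dimensions is quadratic in the number of dimensions, which is exactly why returning only a short ranked list is useful as a starting point for interactive analysis.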
