Two axes re-ordering methods in parallel coordinates plots

Visualizing and interacting with multidimensional data are challenges in visual data analytics, which requires optimized solutions that integrate the display, exploration and analytical reasoning of data into one visual pipeline for human-centered data analysis and interpretation. Although it is one of the most popular techniques for visualizing and analyzing multidimensional data, parallel coordinates visualization also suffers from visual clutter and computational complexity, as do other visualization methods in which clutter grows as the volume of data to be visualized increases. One straightforward way to address these problems is to change the ordering of the axes so as to minimize visual clutter. However, finding the optimal ordering of axes is an NP-complete problem. In this paper, two axes re-ordering methods are proposed for parallel coordinates visualization: (1) a contribution-based method and (2) a similarity-based method. The contribution-based re-ordering method is mainly based on the singular value decomposition (SVD) algorithm. It not only provides users with a mathematical basis for selecting the first remarkable axis, but also helps to visualize the detailed structure of the data according to the contribution of each data dimension. This approach greatly reduces the computational complexity in comparison with other re-ordering methods. The similarity-based re-ordering method is based on a combination of the nonlinear correlation coefficient (NCC) and SVD algorithms. With this approach, axes are re-ordered according to the degree of similarity among them. It is more rational, exact and systematic than other re-ordering methods, including those based on Pearson's correlation coefficient (PCC). The paper also proposes a measure of the contribution rate of each dimension to reveal properties hidden in the dataset.
Finally, the rationale and effectiveness of these approaches are demonstrated through case studies. For example, the patterns of Smurf and Neptune attacks hidden in the KDD 1999 dataset are visualized in parallel coordinates using the contribution-based re-ordering method, and the NCC re-ordering method is shown to enlarge the mean crossing angles and reduce the number of polylines between neighboring axes.
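The abstract does not give the exact formulas, so the following Python sketch is only one plausible reading of the two re-ordering ideas, not the authors' implementation. Everything here is an assumption made for illustration: the function names, the definition of a dimension's contribution rate as its squared loading on the leading right singular vectors weighted by the squared singular values, the rank-binned mutual-information form of the NCC, the choice of 10 bins, and the greedy chaining that starts from the highest-contribution axis.

```python
import numpy as np


def contribution_rates(data, rank=1):
    """Share of the top `rank` singular directions carried by each dimension.

    Assumption (not from the paper): the contribution of dimension j is
    sum_k s_k^2 * V[j, k]^2 over the leading `rank` components, normalised
    so that the rates sum to 1.
    """
    centered = data - data.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    weights = s[:rank] ** 2
    raw = (weights[:, None] * vt[:rank] ** 2).sum(axis=0)
    return raw / raw.sum()


def contribution_order(data, rank=1):
    """Axis indices sorted from highest to lowest contribution rate."""
    return [int(j) for j in np.argsort(-contribution_rates(data, rank))]


def _entropy(p, base):
    """Shannon entropy of a probability array, with logarithms in `base`."""
    p = p[p > 0]
    return float(-(p * np.log(p)).sum() / np.log(base))


def ncc(x, y, bins=10):
    """Rank-binned mutual information H(X) + H(Y) - H(X, Y), logs in base `bins`.

    Assumption: this stands in for the nonlinear correlation coefficient.
    Samples are replaced by ranks and split into `bins` equal-count bins,
    so the value lies in [0, 1] and is 1 for any strictly monotone relation.
    """
    n = len(x)
    bx = np.argsort(np.argsort(x)) * bins // n
    by = np.argsort(np.argsort(y)) * bins // n
    joint = np.zeros((bins, bins))
    np.add.at(joint, (bx, by), 1)  # accumulate joint bin counts
    p = joint / n
    return (_entropy(p.sum(axis=1), bins) + _entropy(p.sum(axis=0), bins)
            - _entropy(p.ravel(), bins))


def similarity_order(data, bins=10):
    """Greedy chain: start at the top-contribution axis, then keep appending
    the unplaced axis most similar (by NCC) to the last placed one.

    Assumption: the abstract only says NCC and SVD are combined; this
    particular combination is hypothetical.
    """
    d = data.shape[1]
    sim = np.array([[ncc(data[:, i], data[:, j], bins) for j in range(d)]
                    for i in range(d)])
    order = [int(np.argmax(contribution_rates(data)))]
    while len(order) < d:
        remaining = [j for j in range(d) if j not in order]
        order.append(max(remaining, key=lambda j: sim[order[-1], j]))
    return order
```

Under these assumptions, `contribution_order(data)` returns the axis order for the contribution-based method and `similarity_order(data)` the order for the similarity-based one, where `data` is an array with one row per record and one column per dimension. The greedy chain avoids searching all axis permutations, at the cost of not guaranteeing an optimal ordering. Unlike PCC, the rank-based measure also responds to non-monotone dependence such as y = x².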
