High-Dimensional Data Visualization by Interactive Construction of Low-Dimensional Parallel Coordinate Plots

Abstract

Parallel coordinate plots (PCPs) are among the most useful techniques for visualizing and exploring high-dimensional data spaces. They are especially useful for representing correlations among dimensions, which reveal relationships and interdependencies between variables. In high-dimensional spaces, however, PCPs struggle to display correlations between combinations of dimensions, and they generally require more display space as the number of dimensions increases. In this paper, we present a new technique for high-dimensional data visualization in which a set of low-dimensional PCPs is interactively constructed by sampling user-selected subsets of the high-dimensional data space. Our technique first constructs a graph visualization of sets of well-correlated dimensions. Users observe this graph and interactively select dimensions by sampling from its cliques, thereby dynamically specifying the most relevant lower-dimensional data to be used for constructing focused PCPs. This interactive sampling overcomes the shortcomings of PCPs by enabling the visualization of the most meaningful dimensions (i.e., the most relevant information) from high-dimensional spaces. We demonstrate the effectiveness of our technique through two case studies, in which the proposed interactive low-dimensional space constructions were pivotal for visualizing the high-dimensional data and discovering new patterns.
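The graph-and-clique step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name the correlation measure or the cut-off, so the use of absolute Pearson correlation, the `threshold=0.8` default, the function names, and the toy data are all assumptions made for illustration. Cliques are enumerated with a basic, non-pivoting form of the Bron-Kerbosch algorithm.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant column has no defined correlation; treat it as uncorrelated.
    return cov / (sx * sy) if sx and sy else 0.0


def correlation_graph(columns, threshold=0.8):
    """Link two dimensions when |r| >= threshold (threshold is an assumption)."""
    adj = {name: set() for name in columns}
    for a, b in combinations(columns, 2):
        if abs(pearson(columns[a], columns[b])) >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already processed.
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


# Hypothetical toy data: one list of values per dimension.
columns = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],   # rises with "a"
    "c": [5, 4, 3, 2, 1],    # falls as "a" rises
    "d": [1, 0, 1, 0, 1],    # unrelated to the others
}

graph = correlation_graph(columns)
for clique in maximal_cliques(graph):
    print(clique)
```

Each clique is a group of mutually well-correlated dimensions, so a user picking one clique (or a subset of it) obtains the axis set for one focused low-dimensional PCP. Taking the absolute value treats strong negative correlation as a link as well, which is a design choice of this sketch rather than something the abstract states.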
