Visual analysis of high dimensional point clouds using topological landscapes

In this paper, we present a novel three-stage process for visualizing the structure of point clouds in arbitrary dimensions. To gain insight into the structure and complexity of a data set, we would ideally just look at it, e.g., by plotting its corresponding point cloud. Unfortunately, for orthogonal scatter plots, this only works up to three dimensions, and other visualizations, such as parallel coordinates or scatterplot matrices, also struggle with many dimensions and with visual overlap of data entities. The presented solution tackles the problem indirectly: rather than visualizing the point cloud itself, it visualizes the topology of the cloud's density distribution. The benefit of this approach is that this topology can be computed in arbitrary dimensions. Much like examining a scatter plot, it reveals important information such as the number, size, and nesting structure of accumulated regions. We view our approach as an alternative to cluster visualization. To create the visualization, we first estimate the density function using a novel high-dimensional interpolation scheme. Second, we compute that function's topology by means of the join tree and generate a corresponding 3-D terrain using the topological landscape metaphor introduced by Weber et al. (2007). Finally, we augment that landscape by placing the original data points at suitable locations.
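The first two stages can be illustrated with a minimal sketch. It is not the method of the paper: the novel interpolation scheme is not described in this abstract, so the sketch assumes a plain Gaussian kernel density estimate evaluated at the sample points and a symmetric k-nearest-neighbor graph as the domain on which the join tree is computed. The function names (`gaussian_density`, `knn_graph`, `join_tree`) and the parameters `bandwidth` and `k` are hypothetical. The join tree is obtained by sweeping the points from high to low density with a union-find structure, which works the same way in any dimension.

```python
import math


def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def gaussian_density(points, bandwidth):
    """Unnormalized Gaussian kernel density at each sample point (assumed stand-in)."""
    return [
        sum(math.exp(-sq_dist(p, q) / (2.0 * bandwidth ** 2)) for q in points)
        for p in points
    ]


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph, returned as a list of adjacency sets."""
    n = len(points)
    adj = [set() for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: sq_dist(points[i], points[j]))
        for j in others[:k]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def join_tree(density, adj):
    """Sweep vertices from high to low density.

    Returns (maxima, merges): maxima are vertices that start a new component,
    merges are (saddle_vertex, [peaks]) events where components join.
    Each component is represented by its highest-density vertex.
    """
    order = sorted(range(len(density)), key=lambda i: (-density[i], i))
    parent = {}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    maxima, merges = [], []
    for v in order:
        parent[v] = v
        # Peaks of all already-swept components touching v, highest first.
        peaks = sorted({find(u) for u in adj[v] if u in parent},
                       key=lambda r: (-density[r], r))
        if not peaks:
            maxima.append(v)          # v is a local density maximum
            continue
        if len(peaks) > 1:
            merges.append((v, peaks))  # v joins several dense regions
        for r in peaks[1:]:
            parent[r] = peaks[0]
        parent[v] = peaks[0]
    return maxima, merges


if __name__ == "__main__":
    # Two small, well-separated clusters in 4-D (illustrative data only).
    base = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
    cloud = base + [tuple(c + 10 for c in p) for p in base]
    dens = gaussian_density(cloud, bandwidth=1.0)
    print(join_tree(dens, knn_graph(cloud, k=2)))
```

In this sketch each maximum corresponds to one accumulated region and each merge event records where two or more regions join at lower density, which is the nesting information a landscape would then display as hills and saddles. Laying out the terrain and placing the original points on it (the third stage) is not covered here.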

[1] Han-Wei Shen, et al. A Near Optimal Isosurface Extraction Algorithm Using the Span Space, 1996, IEEE Trans. Vis. Comput. Graph.

[2] Valerio Pascucci, et al. Topological Landscapes: A Terrain Metaphor for Scientific Data, 2007, IEEE Transactions on Visualization and Computer Graphics.

[3] Alfred Inselberg, et al. Parallel coordinates: a tool for visualizing multi-dimensional geometry, 1990, Proceedings of the First IEEE Conference on Visualization: Visualization '90.

[4] Daniel A. Keim, et al. An Efficient Approach to Clustering in Large Multimedia Databases with Noise, 1998, KDD.

[5] William E. Lorensen, et al. Marching cubes: A high resolution 3D surface construction algorithm, 1987, SIGGRAPH.

[6] J. Tenenbaum, et al. A global geometric framework for nonlinear dimensionality reduction, 2000, Science.

[7] Bowen Alpern, et al. The hyperbox, 1991, Proceeding Visualization '91.

[8] W. Relative Neighborhood Graphs and Their Relatives, 2004.

[9] Dimitrios Gunopulos, et al. Automatic Subspace Clustering of High Dimensional Data, 2005, Data Mining and Knowledge Discovery.

[10] Jack Snoeyink, et al. Computing contour trees in all dimensions, 2000, SODA '00.

[11] Hans-Peter Kriegel, et al. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise, 1996, KDD.

[12] B. Marx. The Visual Display of Quantitative Information, 1985.

[13] Philip S. Yu, et al. Fast algorithms for projected clustering, 1999, SIGMOD '99.

[14] Steven Fortune, et al. Voronoi Diagrams and Delaunay Triangulations, 2004, Handbook of Discrete and Computational Geometry, 2nd Ed.

[15] Jiong Yang, et al. STING: A Statistical Information Grid Approach to Spatial Data Mining, 1997, VLDB.

[16] Valerio Pascucci, et al. Multi-Resolution computation and presentation of Contour Trees, 2005.

[17] Herbert Edelsbrunner, et al. Simulation of simplicity: a technique to cope with degenerate cases in geometric algorithms, 1988, SCG '88.

[18] P. Fayers, et al. The Visual Display of Quantitative Information, 1990.

[19] Ian T. Jolliffe, et al. Principal Component Analysis, 2002, International Encyclopedia of Statistical Science.

[20] Pierre Dragicevic, et al. Rolling the Dice: Multidimensional Visual Exploration using Scatterplot Matrix Navigation, 2008, IEEE Transactions on Visualization and Computer Graphics.

[21] Bernd Hamann, et al. Detecting Critical Regions in Scalar Fields, 2003, VisSym.

[22] Hans-Peter Kriegel, et al. Recursive pattern: a technique for visualizing very large amounts of data, 1995, Proceedings Visualization '95.

[23] R. Sokal, et al. A New Statistical Approach to Geographic Variation Analysis, 1969.

[24] Richard A. Becker, et al. Brushing scatterplots, 1987.

[25] E. M. Wright, et al. Adaptive Control Processes: A Guided Tour, 1961, The Mathematical Gazette.

[26] Masato Okada, et al. Applying Manifold Learning to Plotting Approximate Contour Trees, 2009, IEEE Transactions on Visualization and Computer Graphics.

[27] Graham K. Rand, et al. Quantitative Applications in the Social Sciences, 1983.

[28] Herbert Edelsbrunner, et al. Topological persistence and simplification, 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.

[29] Hans-Peter Kriegel, et al. 'Circle Segments': A Technique for Visually Exploring Large Multidimensional Data Sets, 1996.

[30] Teuvo Kohonen, et al. Self-Organizing Maps, 2010.

[31] John M. Chambers, et al. Graphical Methods for Data Analysis, 1983.

[32] Aidong Zhang, et al. WaveCluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases, 1998, VLDB.

[33] Marc Levoy, et al. Display of surfaces from volume data, 1988, IEEE Computer Graphics and Applications.

[34] A. Buja, et al. Prosection Views: Dimensional Inference through Sections and Projections, 1994.

[35] Roger L. Boyell, et al. Hybrid techniques for real-time radar simulation, 1963, AFIPS '63 (Fall).

[36] Karl-Heinrich Anders, et al. Parameterfreies hierarchisches Graph-Clustering-Verfahren zur Interpretation raumbezogener Daten, 2004.