Geometric and Statistical Summaries for Big Data Visualization

The visualization and data analysis paradigm is adapting quickly to keep pace with the rapid growth in computing power and data size. Modern scientific simulations run at massive scale and produce huge datasets, which domain experts must analyze and visualize to continue innovating. With large-scale data, it is important to identify and extract the informative regions at an early stage so that subsequent analysis algorithms, which are usually memory- and compute-intensive, can focus only on those regions. Transforming the raw data into a compact yet meaningful representation also helps keep querying and visualization of analysis results interactive. In this dissertation, we propose a novel, general-purpose framework for exploring large-scale data. We propose to use importance-based data summaries, which can substitute for the raw data to answer queries and drive visual exploration. Since the definition of importance depends on the nature of the data and the task at hand, we propose to use suitable statistical and geometric measures, or a combination of such measures, to quantify importance and perform data reduction on scalar and vector field data. Our research demonstrates two instances of the proposed framework. The first instance applies to large numbers of streamlines computed from vector fields. We make the visual exploration of such data much easier than navigating through a cluttered 3D visualization of the raw data. In this case, we introduce a fractal-dimension-based metric called the box counting ratio, which quantifies the geometric complexity of
