Binary Mesh Partitioning for Cache-Efficient Visualization

One important bottleneck when visualizing large data sets is the data transfer between processor and memory. Cache-aware (CA) and cache-oblivious (CO) algorithms take the memory hierarchy into account to achieve cache efficiency. CO approaches have the advantage of adapting to unknown and varying memory hierarchies. Recent CA and CO algorithms developed for 3D mesh layouts significantly improve on the performance of previous approaches, but they lack theoretical performance guarantees. In this paper we present an O(N log N) algorithm to compute a CO layout for unstructured but well-shaped meshes. We prove that a coherent traversal of an N-size mesh in dimension d induces fewer than N/B + O(N/M^{1/d}) cache misses, where B and M are the block size and the cache size, respectively. Experiments show that our layout computation is faster and consumes significantly less memory than the best known CO algorithm. Performance is comparable to that algorithm for the access patterns of classical visualization algorithms, and better when the BSP tree produced while computing the layout is used as an acceleration data structure adjusted to the layout. We also show that cache-oblivious approaches lead to significant performance increases on recent GPU architectures.
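To make the idea of a layout derived from binary partitioning concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes that vertices are given as coordinate tuples, splits them recursively at the median along the axis of largest extent, and lists the leaves left to right to obtain an ordering. The names `bsp_layout` and `miss_bound_leading_terms` and the constant `c` are hypothetical; the abstract does not give the constant hidden in the O(.) term. Because this sketch sorts at every level, it runs in O(N log^2 N) rather than the O(N log N) stated above.

```python
def bsp_layout(points, leaf_size=1):
    """Return point indices ordered by a recursive median split.

    Illustrative only: a simple axis-aligned binary partition,
    not the partitioning scheme of the paper.
    """
    order = []

    def extent(indices, axis):
        values = [points[i][axis] for i in indices]
        return max(values) - min(values)

    def split(indices):
        if len(indices) <= leaf_size:
            order.extend(indices)
            return
        dim = len(points[indices[0]])
        # Cut along the axis where the current subset is widest.
        axis = max(range(dim), key=lambda a: extent(indices, a))
        indices = sorted(indices, key=lambda i: points[i][axis])
        mid = len(indices) // 2
        # Points in the same half end up contiguous in the layout.
        split(indices[:mid])
        split(indices[mid:])

    split(list(range(len(points))))
    return order


def miss_bound_leading_terms(n, block, cache, d, c=1.0):
    """Evaluate N/B + c * N / M^(1/d) for an assumed constant c."""
    return n / block + c * n / cache ** (1.0 / d)


if __name__ == "__main__":
    grid = [(0, 0), (1, 0), (0, 1), (1, 1)]
    print(bsp_layout(grid))
    print(miss_bound_leading_terms(1000, 10, 100, 2))
```

The returned list is a permutation of the vertex indices; storing vertices in that order keeps spatially close vertices close in memory at every scale of the recursion, which is the property a cache-oblivious layout relies on.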
