Eurographics Symposium on Parallel Graphics and Visualization (2013)

Analysis of Cache Behavior and Performance of Different BVH Memory Layouts for Tracing Incoherent Rays

With CPUs moving toward many-core architectures and GPUs becoming more general-purpose, path tracing can now be parallelized well on commodity hardware. While parallelization is trivial in theory, properties of real hardware make efficient parallelization difficult, especially when tracing incoherent rays. We investigate how different bounding volume hierarchy (BVH) and node memory layouts, as well as storing the BVH in different memory areas, affect the ray tracing performance of a GPU path tracer. We optimize the BVH layout using information gathered in a pre-processing pass, applying a number of different BVH reordering techniques. Depending on the memory area and scene complexity, we achieve moderate speedups.
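To make the idea of a BVH memory layout concrete, the following minimal sketch linearizes one small binary tree in three ways: depth-first, breadth-first, and a depth-first variant that stores the more frequently visited child directly after its parent. It is an illustration under stated assumptions, not the reordering techniques evaluated in the paper: the `Node` class, the `visits` counter (standing in for statistics gathered in a pre-processing pass), and all function names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str                       # label, for illustration only
    visits: int = 0                 # hypothetical visit count from a pre-pass
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def depth_first_layout(root: Node) -> List[str]:
    """Pre-order: each parent is followed directly by its left subtree."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node.name)
        # Push right first so that left is popped (and stored) first.
        for child in (node.right, node.left):
            if child is not None:
                stack.append(child)
    return order


def breadth_first_layout(root: Node) -> List[str]:
    """Level order: siblings and cousins are stored next to each other."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node.name)
        for child in (node.left, node.right):
            if child is not None:
                queue.append(child)
    return order


def visit_ordered_layout(root: Node) -> List[str]:
    """Pre-order, but the more frequently visited child is stored first."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node.name)
        children = [c for c in (node.left, node.right) if c is not None]
        # Ascending sort leaves the most-visited child on top of the stack.
        children.sort(key=lambda c: c.visits)
        stack.extend(children)
    return order


def example_tree() -> Node:
    """Seven-node tree with made-up visit counts."""
    b = Node("B", 20, Node("D", 15), Node("E", 5))
    c = Node("C", 80, Node("F", 10), Node("G", 70))
    return Node("A", 100, b, c)


if __name__ == "__main__":
    root = example_tree()
    print("depth-first:  ", depth_first_layout(root))
    print("breadth-first:", breadth_first_layout(root))
    print("visit-ordered:", visit_ordered_layout(root))
```

Each returned list is the order in which nodes would be written to a flat array. Whether any given order improves cache behavior depends on the hardware and on the memory area the array is stored in, which is the question the paper examines.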
