SIMD Friendly Ray Tracing on GPU

In this paper, we present a novel BVH tracing method for the GPU that achieves better SIMD utilization than traditional methods. Traditionally, a thread sticks to one ray until its closest hit is found. When the threads of a warp follow very divergent ray paths, SIMD utilization drops significantly. Our method redefines work distribution by binding a ray together with the data to be tested against it, so that the computation for a single ray can be spread across multiple threads. We also separate the tracing process into three steps, so that work units of the same type are collected and processed in a stream-like manner. The first step, ray traversal, performs ray-box tests on ray-node pairs. Its output is a stack of ray-triangle pairs, which is fed to the intersection step to produce a stack of ray-hit pairs. The last step uses the ray-hit pairs to update the closest hit for each ray of the same warp. Experiments show that our method improves SIMD utilization and reduces tracing time.
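
The three steps can be illustrated with a minimal sequential sketch in Python. This is not the paper's GPU implementation: it is a CPU illustration only, and the names `Node`, `hits_box`, `hit_triangle`, and `trace` are hypothetical. It also assumes plain lists processed level by level in place of the per-warp stacks, uses a standard slab test for ray-box testing, and uses the Möller-Trumbore test for ray-triangle intersection. Each loop body handles one independent work unit of a single type, which is the property that would let a GPU version assign work units, rather than whole rays, to threads.

```python
from dataclasses import dataclass
import math

@dataclass
class Node:                      # hypothetical BVH node layout
    lo: tuple                    # box minimum corner
    hi: tuple                    # box maximum corner
    children: tuple = ()         # child node indices (inner node)
    triangles: tuple = ()        # triangle indices (leaf node)

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def hits_box(origin, direction, lo, hi, t_max):
    """Slab test: True if the ray overlaps the box within [0, t_max]."""
    t0, t1 = 0.0, t_max
    for o, d, a, b in zip(origin, direction, lo, hi):
        if d == 0.0:
            if o < a or o > b:   # parallel to the slab and outside it
                return False
            continue
        near, far = (a - o) / d, (b - o) / d
        if near > far:
            near, far = far, near
        t0, t1 = max(t0, near), min(t1, far)
    return t0 <= t1

def hit_triangle(origin, direction, tri):
    """Moller-Trumbore test: returns the hit distance, or None on a miss."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < 1e-12:         # ray parallel to the triangle plane
        return None
    s = sub(origin, v0)
    u = dot(s, p) / det
    q = cross(s, e1)
    v = dot(direction, q) / det
    if u < 0 or v < 0 or u + v > 1:
        return None
    t = dot(e2, q) / det
    return t if t > 1e-9 else None

def trace(rays, nodes, triangles):
    closest = [(math.inf, None)] * len(rays)          # per-ray (distance, triangle)
    node_pairs = [(r, 0) for r in range(len(rays))]   # ray-node pairs; node 0 is the root
    while node_pairs:
        # Step 1, traversal: ray-box test for every ray-node pair.
        next_pairs, tri_pairs = [], []
        for r, n in node_pairs:
            node = nodes[n]
            if hits_box(*rays[r], node.lo, node.hi, closest[r][0]):
                next_pairs += [(r, c) for c in node.children]
                tri_pairs += [(r, t) for t in node.triangles]
        # Step 2, intersection: ray-triangle pairs become ray-hit pairs.
        hit_pairs = []
        for r, t in tri_pairs:
            dist = hit_triangle(*rays[r], triangles[t])
            if dist is not None:
                hit_pairs.append((r, dist, t))
        # Step 3, update: keep the closest hit for each ray.
        for r, dist, t in hit_pairs:
            if dist < closest[r][0]:
                closest[r] = (dist, t)
        node_pairs = next_pairs
    return closest

# Usage: two parallel triangles at z = 2 and z = 1, one hitting ray and one missing ray.
triangles = [((-1, -1, 2), (1, -1, 2), (0, 1, 2)),
             ((-1, -1, 1), (1, -1, 1), (0, 1, 1))]
nodes = [Node((-1, -1, 1), (1, 1, 2), children=(1, 2)),
         Node((-1, -1, 2), (1, 1, 2), triangles=(0,)),
         Node((-1, -1, 1), (1, 1, 1), triangles=(1,))]
rays = [((0.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
        ((5.0, 5.0, 0.0), (0.0, 0.0, 1.0))]
result = trace(rays, nodes, triangles)
```

In this usage, the first ray reaches both leaves and ends with triangle 1 at distance 1 as its closest hit, while the second ray is rejected at the root box and records no hit. In a GPU version, each of the three loops would presumably be executed by the threads of a warp over a shared list, so that lanes stay busy even when individual rays generate very different amounts of work.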
