Improving SIMD efficiency for parallel Monte Carlo light transport on the GPU

Monte Carlo light transport algorithms such as Path Tracing (PT), Bidirectional Path Tracing (BDPT), and Metropolis Light Transport (MLT) use random walks to sample light transport paths. When these algorithms are parallelized on the GPU, the stochastic termination of random walks produces an uneven workload between samples, which reduces SIMD efficiency. In this paper, we propose combining stream compaction and sample regeneration to keep SIMD efficiency high during random walk construction despite stochastic termination. Furthermore, for BDPT and MLT, we propose evaluating all bidirectional connections of a sample in parallel to balance the workload between GPU threads and improve SIMD efficiency during sample evaluation. We present efficient parallel GPU-only implementations of PT, BDPT, and MLT in CUDA, and show that they outperform similar CPU implementations by an order of magnitude.
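
To make the effect of stochastic termination on lane occupancy concrete, the following is a minimal serial sketch in Python, not the CUDA implementation described above. It models a single group of `lanes` SIMD lanes and reports the average fraction of lanes doing useful work per extension step. The function name `simulate`, the fixed survival probability `survive_prob`, and the single-group model are all assumptions made for illustration; a real GPU implementation would use a parallel scan for the compaction step and scene-dependent termination probabilities.

```python
import random

def simulate(num_samples, lanes=32, survive_prob=0.7, regenerate=True, seed=1):
    """Toy model: average fraction of busy lanes per step (a proxy for SIMD efficiency).

    All parameters are hypothetical; survive_prob stands in for the
    per-bounce continuation probability of a random walk.
    """
    rng = random.Random(seed)
    remaining = num_samples   # samples that have not been started yet
    active = []               # compacted list of live random walks (ids only)
    next_id = 0
    busy = 0                  # total number of occupied lane-steps
    steps = 0                 # total number of extension steps executed

    while remaining > 0 or active:
        # Sample regeneration: refill free lanes with fresh samples every step.
        # Without regeneration, new samples start only once the whole batch is done.
        if regenerate or not active:
            n_new = min(lanes - len(active), remaining)
            active.extend(range(next_id, next_id + n_new))
            next_id += n_new
            remaining -= n_new

        # One extension step for every live walk; idle lanes do no useful work.
        busy += len(active)
        steps += 1

        # Stochastic termination followed by stream compaction:
        # survivors are packed contiguously at the front of the list.
        active = [walk for walk in active if rng.random() < survive_prob]

    return busy / (steps * lanes) if steps else 0.0

if __name__ == "__main__":
    print("with regeneration:   ", simulate(3200, regenerate=True))
    print("without regeneration:", simulate(3200, regenerate=False))
```

In this model, the run without regeneration waits for the longest walk in each batch while the other lanes sit idle, whereas the run with regeneration keeps the lanes full until the supply of new samples is exhausted.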
