Monte Carlo integration techniques for global illumination are popular on GPUs thanks to their massively parallel architecture, but efficient implementation remains challenging. Using randomly decorrelated low-discrepancy sequences in the path-tracing algorithm allows faster visual convergence. However, tracing incoherent rays in parallel often results in poor memory cache utilization, which reduces the ray bandwidth efficiency. Interleaved sampling [Keller et al. 2001] partially solves this problem by using a small set of distributions split into coherent ray-tracing passes, but the solution is prone to structured noise. On the other hand, ray-reordering methods [Pharr et al. 1997] group stochastic rays into coherent ray packets, but their implementation adds a sorting cost on the GPU [Moon et al. 2010] [Garanzha and Loop 2010].
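As an illustration of the interleaved sampling idea mentioned above, the following is a minimal Python sketch, not the implementation of any cited work. It assumes a Halton sequence as the low-discrepancy sequence and a Cranley-Patterson rotation (a random toroidal shift) as the decorrelation step; the text names neither, so both are assumptions. All function names and the tile size are hypothetical. Pixels that share the same position within a small tile reuse the same shifted sequence, so they can be traced together in one coherent pass.

```python
import random

def radical_inverse(i, base=2):
    """Van der Corput radical inverse of the integer i in the given base."""
    inv, f = 0.0, 1.0 / base
    while i > 0:
        inv += (i % base) * f
        i //= base
        f /= base
    return inv

def halton_2d(i):
    """i-th point of the 2D Halton sequence (bases 2 and 3), an assumed choice."""
    return radical_inverse(i, 2), radical_inverse(i, 3)

def make_tile_offsets(tile, seed=0):
    """One random 2D shift per position inside a tile x tile block (hypothetical helper)."""
    rng = random.Random(seed)
    return [[(rng.random(), rng.random()) for _ in range(tile)]
            for _ in range(tile)]

def pixel_sample(x, y, i, offsets):
    """i-th sample for pixel (x, y): Halton point shifted by the pixel's tile offset."""
    tile = len(offsets)
    du, dv = offsets[y % tile][x % tile]
    u, v = halton_2d(i)
    # Cranley-Patterson rotation: add the shift and wrap around in [0, 1).
    return (u + du) % 1.0, (v + dv) % 1.0

def interleaved_passes(width, height, tile):
    """Group pixels by tile position; each group shares one sample distribution."""
    passes = {}
    for y in range(height):
        for x in range(width):
            passes.setdefault((x % tile, y % tile), []).append((x, y))
    return passes
```

With a tile size of 2, a 4x4 image splits into four passes of four pixels each. Because only `tile * tile` distinct distributions exist, the same sampling pattern repeats across the image, which is one way to see why the approach is prone to structured noise.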
Hye-Sun Kim, et al. Cache-oblivious ray reordering.
Charles T. Loop, et al. Fast Ray Sorting and Breadth-First Packet Traversal for GPU Ray Tracing. Comput. Graph. Forum.
Alexander Keller, et al. Efficient Multidimensional Sampling. Comput. Graph. Forum.
Mark Meyer, et al. A theory of Monte Carlo visibility sampling.
Pat Hanrahan, et al. Rendering complex scenes with memory-coherent ray tracing.
Wolfgang Heidrich, et al. Interleaved Sampling. Rendering Techniques.