Monte Carlo integration techniques for global illumination are popular on GPUs thanks to their massively parallel architecture, but efficient implementation remains challenging. Using randomly decorrelated low-discrepancy sequences in the path-tracing algorithm allows faster visual convergence. However, tracing incoherent rays in parallel often results in poor memory cache utilization, which reduces ray bandwidth efficiency. Interleaved sampling [Keller et al. 2001] partially solves this problem by using a small set of distributions split into coherent ray-tracing passes, but the solution is prone to structured noise. On the other hand, ray-reordering methods [Pharr et al. 1997] group stochastic rays into coherent ray packets, but their implementation adds a sorting cost on the GPU [Moon et al. 2010] [Garanzha and Loop 2010].
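As an illustration of the interleaved-sampling idea mentioned above, the following is a minimal sketch and not the method of any cited work. It assumes a 2D Halton sequence as the low-discrepancy source, a Cranley-Patterson rotation (a random toroidal shift) as the decorrelation step, and a small square tile of shifts repeated over the image. All function names and the tile layout are hypothetical.

```python
import random


def radical_inverse(index, base):
    """Van der Corput radical inverse of `index` in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def halton_2d(index):
    """2D Halton point (bases 2 and 3); an assumed choice of sequence."""
    return radical_inverse(index, 2), radical_inverse(index, 3)


def make_tile_offsets(tile_size, seed=0):
    """One random 2D shift per tile cell (hypothetical layout)."""
    rng = random.Random(seed)
    return [[(rng.random(), rng.random()) for _ in range(tile_size)]
            for _ in range(tile_size)]


def pixel_sample(x, y, sample_index, offsets):
    """Halton point shifted by the offset of the pixel's tile cell."""
    tile_size = len(offsets)
    dx, dy = offsets[y % tile_size][x % tile_size]
    u, v = halton_2d(sample_index)
    # Toroidal shift keeps the point inside the unit square.
    return (u + dx) % 1.0, (v + dy) % 1.0


def coherent_passes(width, height, tile_size):
    """Group pixels by tile cell; each group shares one sample pattern."""
    passes = {}
    for y in range(height):
        for x in range(width):
            passes.setdefault((x % tile_size, y % tile_size), []).append((x, y))
    return passes
```

Pixels in the same group draw identical sample values, so their rays tend to be similar and could be traced together in one pass. The same sharing is what makes the repeated tile visible as structured noise when the tile is small.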
[1] Hye-Sun Kim et al. Cache-oblivious ray reordering. TOGS, 2010.
[2] Charles T. Loop et al. Fast Ray Sorting and Breadth-First Packet Traversal for GPU Ray Tracing. Comput. Graph. Forum, 2010.
[3] Alexander Keller et al. Efficient Multidimensional Sampling. Comput. Graph. Forum, 2002.
[4] Mark Meyer et al. A theory of Monte Carlo visibility sampling. TOGS, 2012.
[5] Pat Hanrahan et al. Rendering complex scenes with memory-coherent ray tracing. SIGGRAPH, 1997.
[6] Wolfgang Heidrich et al. Interleaved Sampling. Rendering Techniques, 2001.