SGRT: a mobile GPU architecture for real-time ray tracing

With the growing demand for photorealistic graphics and rapid advances in desktop CPUs and GPUs, real-time ray tracing has attracted considerable attention. Ray tracing in the current mobile environment, however, remains very difficult because mobile GPUs lack sufficient computing power, memory bandwidth, and flexibility. In this paper, we present a novel mobile GPU architecture called SGRT (Samsung reconfigurable GPU based on Ray Tracing), which combines a fast, compact hardware accelerator with a flexible programmable shader. SGRT has two key features: 1) an area-efficient parallel pipelined traversal unit; and 2) flexible, high-performance kernels for shading and ray generation. Simulation results show that SGRT is potentially a versatile graphics solution for future application processors, as it provides real-time ray tracing performance at full HD resolution that can compete with that of existing desktop GPU ray tracers. Our system is implemented on an FPGA platform, and mobile ray tracing is successfully demonstrated.
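To give a sense of scale for "real-time ray tracing at full HD resolution", the following minimal sketch estimates the primary-ray throughput such a target implies. The frame rate of 30 frames per second and the rays-per-pixel counts are illustrative assumptions, not figures from the paper, and the function name `required_rays_per_second` is hypothetical.

```python
def required_rays_per_second(width, height, fps, rays_per_pixel=1):
    """Rays that must be traced each second to sustain the given frame rate.

    All arguments are illustrative assumptions; the abstract only states
    "real-time" performance at full HD resolution.
    """
    return width * height * fps * rays_per_pixel


# Full HD is 1920 x 1080 pixels; 30 fps is an assumed real-time target.
primary_only = required_rays_per_second(1920, 1080, fps=30)
print(primary_only)  # 62208000

# Assumed budget of 4 rays per pixel (e.g. primary plus secondary rays).
with_secondary = required_rays_per_second(1920, 1080, fps=30, rays_per_pixel=4)
print(with_secondary)  # 248832000
```

Under these assumptions, one primary ray per pixel already requires roughly 62 million rays per second, and secondary rays for shadows, reflections, or refractions multiply that figure, which indicates why dedicated traversal hardware is attractive on a mobile power budget.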
