GRAMPS: A programming model for graphics pipelines

We introduce GRAMPS, a programming model that generalizes concepts from modern real-time graphics pipelines by exposing an execution model in which fixed-function and application-programmable processing stages exchange data via queues. GRAMPS allows the number, type, and connectivity of these stages to be defined in software, permitting arbitrary processing pipelines or even processing graphs. Applications achieve high performance with GRAMPS by expressing advanced rendering algorithms as custom pipelines, then using the pipeline as a rendering engine. We describe the design of GRAMPS, then evaluate it by implementing three pipelines (Direct3D, a ray tracer, and a hybrid of the two) and running them on emulations of two different GRAMPS implementations: a traditional GPU-like architecture and a CPU-like multicore architecture. In our tests, our GRAMPS schedulers run these pipelines with peak queue usage of 500 to 1500 KB.
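
To make the stage-and-queue idea concrete, the following is a minimal Python sketch, not the GRAMPS API. All names (`Graph`, `add_queue`, `add_stage`, `run`, `rasterize`, `shade`) are hypothetical, the round-robin loop stands in for a real scheduler, and queue usage is counted in items rather than bytes. It only illustrates that the application defines the stages and their connectivity, and that a scheduler can observe peak queue occupancy.

```python
from collections import deque


class Graph:
    """Toy model (hypothetical): stages connected by FIFO queues."""

    def __init__(self):
        self.queues = {}      # queue name -> deque of pending items
        self.stages = []      # (function, input queue, output queue or None)
        self.peak_items = 0   # highest total queue occupancy seen, in items

    def add_queue(self, name):
        self.queues[name] = deque()

    def add_stage(self, fn, src, dst=None):
        # fn consumes one item from src and returns a list of items for dst.
        self.stages.append((fn, src, dst))

    def _record_peak(self):
        total = sum(len(q) for q in self.queues.values())
        self.peak_items = max(self.peak_items, total)

    def run(self):
        # Naive round-robin scheduling: each pass gives every stage with
        # pending input one invocation; stop when no stage can make progress.
        self._record_peak()
        progress = True
        while progress:
            progress = False
            for fn, src, dst in self.stages:
                if self.queues[src]:
                    outputs = fn(self.queues[src].popleft())
                    if dst is not None:
                        self.queues[dst].extend(outputs)
                    self._record_peak()
                    progress = True


framebuffer = []


def rasterize(triangle):
    # Stand-in for a fixed-function stage: each triangle yields two fragments.
    return [(triangle, i) for i in range(2)]


def shade(fragment):
    # Stand-in for a programmable stage: write the fragment, emit nothing.
    framebuffer.append(fragment)
    return []


g = Graph()
g.add_queue("triangles")
g.add_queue("fragments")
g.add_stage(rasterize, "triangles", "fragments")
g.add_stage(shade, "fragments")
g.queues["triangles"].extend(["t0", "t1"])
g.run()
```

Adding a stage or rerouting a queue changes only the `add_queue` and `add_stage` calls, which is the sense in which the pipeline's shape is defined by software in this sketch.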
