Hardware implementation of micropolygon rasterization with motion and defocus blur

Current GPUs rasterize micropolygons (polygons approximately one pixel in size) inefficiently. Additionally, they do not natively support triangle rasterization with jittered sampling, defocus, or motion blur. We perform a microarchitectural study of fixed-function micropolygon rasterization using custom circuits. We present three rasterization designs: the first is optimized for triangle micropolygons that are not blurred, the second performs stochastic rasterization of micropolygons with motion and defocus blur, and the third is a hybrid of the two. Our designs achieve high area and power efficiency by using low-precision operations and rasterizing pairs of adjacent triangles in parallel. We demonstrate optimized designs synthesized in a 45 nm process, showing that a micropolygon rasterization unit with a throughput of 3 billion micropolygons per second would consume 2.9 W and occupy 4.1 mm², which is 0.77% of the die area of a GeForce GTX 480 GPU.
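To put the quoted figures in perspective, the short sketch below derives two quantities the abstract implies but does not state: the energy spent per micropolygon and the total die area implied by the 0.77% figure. The constant and function names are illustrative only, and the derived values are back-of-the-envelope estimates from the abstract's numbers, not results reported by the authors.

```python
# Figures quoted in the abstract.
THROUGHPUT_PER_S = 3e9   # micropolygons per second
POWER_W = 2.9            # watts
AREA_MM2 = 4.1           # square millimetres
DIE_FRACTION = 0.0077    # 0.77% of the GeForce GTX 480 die


def energy_per_micropolygon_nj(power_w: float, throughput_per_s: float) -> float:
    """Energy per micropolygon in nanojoules (power divided by throughput)."""
    return power_w / throughput_per_s * 1e9


def implied_die_area_mm2(unit_area_mm2: float, fraction: float) -> float:
    """Total die area implied by the unit's area and its share of the die."""
    return unit_area_mm2 / fraction


energy_nj = energy_per_micropolygon_nj(POWER_W, THROUGHPUT_PER_S)
die_mm2 = implied_die_area_mm2(AREA_MM2, DIE_FRACTION)

print(f"energy per micropolygon: {energy_nj:.2f} nJ")
print(f"implied die area: {die_mm2:.0f} mm^2")
```

Under these assumptions the unit spends a little under 1 nJ per micropolygon, and the 0.77% figure corresponds to a die of a bit over 530 mm².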
