Two-phase load distribution for rendering large 3D models on a graphics cluster

In this paper we address the problem of distributing rendering computations for real-time display of very large 3D models on a graphics cluster. With a programmable graphics processing unit (GPU) in each node, rendering is increasingly carried out in two phases using two separate GPU programs: a vertex shader program for vertex (geometry) processing and a fragment shader program for pixel (color) processing. As fragment shader programs become more time-consuming in pursuit of greater realism and special visual effects, distributing load solely on the basis of geometry, as most contemporary systems do, can cause significant load imbalance. The distribution of geometry is often only weakly correlated with the distribution of pixel data, owing to factors such as the occlusion of objects by other objects in front of them. Balancing load for geometry processing alone or for pixel processing alone is therefore not optimal. We present a novel in-frame two-phase load-balancing technique that distributes data first for geometry processing and then for pixel processing. The technique is implemented on a graphics cluster, and experimental results demonstrate considerable improvements in rendering performance.
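
To make the imbalance argument concrete, the following is a minimal sketch and not the method of the paper. It assumes a hypothetical two-node cluster, made-up per-object vertex and fragment costs, and a generic greedy (longest-processing-time-first) heuristic standing in for whatever balancing scheme is actually used. The names `greedy_balance` and `node_loads` are illustrative only.

```python
import heapq


def greedy_balance(costs, num_nodes):
    """Assign each item to the currently least-loaded node, heaviest item first.

    This is a generic greedy heuristic used here as a stand-in; it is an
    assumption, not the balancing scheme described in the paper.
    """
    heap = [(0.0, node) for node in range(num_nodes)]
    heapq.heapify(heap)
    assignment = {}
    for item, cost in sorted(costs.items(), key=lambda kv: -kv[1]):
        load, node = heapq.heappop(heap)
        assignment[item] = node
        heapq.heappush(heap, (load + cost, node))
    return assignment


def node_loads(costs, assignment, num_nodes):
    """Sum the given costs per node under a given item-to-node assignment."""
    loads = [0.0] * num_nodes
    for item, node in assignment.items():
        loads[node] += costs[item]
    return loads


NUM_NODES = 2

# Hypothetical costs. Object "A" has the most vertices but is mostly occluded,
# so it produces few fragments: geometry and pixel load are weakly correlated.
vertex_cost = {"A": 900, "B": 500, "C": 400, "D": 100}
fragment_cost = {"A": 50, "B": 700, "C": 650, "D": 100}

# Phase 1: balance geometry processing by vertex cost.
geometry_assignment = greedy_balance(vertex_cost, NUM_NODES)
geometry_loads = node_loads(vertex_cost, geometry_assignment, NUM_NODES)

# Pixel load if the geometry assignment were simply reused for pixel work.
pixel_loads_geometry_only = node_loads(fragment_cost, geometry_assignment, NUM_NODES)

# Phase 2: redistribute pixel work by fragment cost.
pixel_assignment = greedy_balance(fragment_cost, NUM_NODES)
pixel_loads_two_phase = node_loads(fragment_cost, pixel_assignment, NUM_NODES)

print("geometry loads:", geometry_loads)
print("pixel loads, geometry-only distribution:", pixel_loads_geometry_only)
print("pixel loads, after second phase:", pixel_loads_two_phase)
```

In this toy setting the geometry phase is nearly balanced, yet reusing its assignment leaves one node with most of the fragment work; a second distribution step keyed on fragment cost evens it out. A real system would distribute screen regions or fragments rather than whole objects and would have to account for the cost of moving data between phases, which this sketch ignores.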
