Scaling of 3D game engine workloads on modern multi-GPU systems

This work is a first attempt to characterize 3D game workloads running on commodity multi-GPU systems. Depending on the workload-balancing mode used for rendering, the intra-frame and inter-frame dependencies introduced by render-to-texture require a number of synchronizations that can significantly limit scalability across multiple GPUs. In this paper, a proprietary analytical tool called EMPATHY is used to evaluate, for a set of popular DX9 games, the performance of the classic split frame and alternate frame rendering modes, as well as combined modes supporting more than 4 GPUs. We also evaluate the early copy and concurrent update techniques, applied together, as an alternative to delayed copy of render-to-texture surfaces, showing a 48% improvement for some workloads.
