Paper information - Accessing CUDA Features in the OpenGL Rendering Pipeline: A Case Study Using N-Body Simulation

Accessing CUDA Features in the OpenGL Rendering Pipeline: A Case Study Using N-Body Simulation

Advances in graphics processing unit (GPU) architecture and its rapid evolution towards general-purpose computing have led many applications to adopt an interoperability approach that combines general-purpose GPU computing (GPGPU) with a graphics computing interface, in which the former performs the heavy calculations and the latter the 3D graphics rendering. Because GPGPU exposes several hardware features, such as shared memory and a thread synchronization mechanism, it allows a developer to write more efficient code. Nevertheless, we conjecture that such hardware features are also available in the graphics computing interface OpenGL 4.5 or later through the graphics concepts of blending, transform feedback, tessellation and instancing. In this paper we assess our conjecture by implementing an N-body simulation with both approaches. In doing so, we devise a novel non-graphics application of the tessellation hardware and the instanced rendering circuit. Instead of refining a mesh, we use the abstract patch to gain direct access to shared memory. Instead of drawing multiple objects, we apply instanced rendering to improve sequential data access. A comparative timing analysis is provided. We believe that these results provide a better understanding of the graphics features that are useful for closing the performance gap between OpenGL and a GPGPU architecture, and open a new perspective on implementing, solely with OpenGL, graphics applications that require both intense, but pre-specified, memory accesses and 3D graphics rendering.
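To make the role of shared memory in an N-body simulation concrete, here is a minimal CPU-side sketch in Python of the tiled all-pairs pattern. It is not the authors' implementation. The function name `accelerations`, the tile size, the softening term `eps` and the constant `G` are all assumptions made for illustration. Each tile of source bodies is loaded once and reused for every target body, which is the kind of pre-specified, intense memory access that a GPU would serve from shared memory after a thread barrier.

```python
import math


def accelerations(pos, mass, tile=4, eps=1e-3, G=1.0):
    """All-pairs gravitational accelerations, visiting source bodies tile by tile.

    pos  : list of [x, y, z] positions
    mass : list of masses, same length as pos
    tile : number of source bodies staged together (hypothetical block size)
    eps  : softening length that keeps the self-interaction term finite
    """
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for start in range(0, n, tile):
        # On a GPU this slice would be copied once into shared memory,
        # followed by a barrier, and then read by every thread in the block.
        tile_pos = pos[start:start + tile]
        tile_mass = mass[start:start + tile]
        for i in range(n):
            xi, yi, zi = pos[i]
            ax, ay, az = acc[i]
            for (xj, yj, zj), mj in zip(tile_pos, tile_mass):
                dx, dy, dz = xj - xi, yj - yi, zj - zi
                r2 = dx * dx + dy * dy + dz * dz + eps * eps
                inv_r3 = 1.0 / (r2 * math.sqrt(r2))
                # A body paired with itself has dx = dy = dz = 0,
                # so it contributes nothing to its own acceleration.
                ax += G * mj * dx * inv_r3
                ay += G * mj * dy * inv_r3
                az += G * mj * dz * inv_r3
            acc[i] = [ax, ay, az]
    return acc
```

The tile size changes only how the source bodies are grouped, not the result, so it can be tuned to the amount of fast memory available without affecting the physics.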

Shin-Ting Wu | Mario Santos Camillo
