Overview and comparison of OpenCL and CUDA technology for GPGPU

GPU (Graphics Processing Unit) has a great impact on computing field. To enhance the performance of computing systems, researchers and developers use the parallel computing architecture of GPU. On the other hand, to reduce the development time of new products, two programming models are included in GPU, which are OpenCL (Open Computing Language) and CUDA (Compute Unified Device Architecture). The benefit of involving the two programming models in GPU is that researchers and developers don't have to understand OpenGL, DirectX or other program design, but can use GPU through simple programming language. OpenCL is an open standard API, which has the advantage of cross-platform. CUDA is a parallel computer architecture developed by NVIDIA, which includes Runtime API and Driver API. Compared with OpenCL, CUDA is with better performance. In this paper, we used plenty of similar kernels to compare the computing performance of C, OpenCL and CUDA, the two kinds of API's on NVIDIA Quadro 4000 GPU. The experimental result showed that, the executive time of CUDA Driver API was 94.9%~99.0% faster than that of C, while and the executive time of CUDA Driver API was 3.8%~5.4% faster than that of OpenCL. Accordingly, the cross-platform characteristic of OpenCL did not affect the performance of GPU.