A Highly Efficient Implementation of I/O Functions on GPU

The APIs provided by CUDA help programmers develop CUDA applications and achieve high performance on GPUs. However, many I/O operations are not supported in device code. This paper implements most of the common I/O functions, such as file read/write and 'printf', through a host-side agent, exploiting the characteristics of mapped memory in CUDA. The proposed implementation does not degrade the performance of the original application, and users' I/O requests are serviced quickly; moreover, the 'printf' implemented in this paper outperforms the one provided by CUDA. This work offers GPU users a simple and effective real-time debugging method, can improve the productivity of porting legacy C/C++ code to CUDA, and is a valuable step toward broadening CUDA's functionality.
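The host-agent mechanism the abstract describes can be sketched with CUDA mapped (zero-copy) host memory: the device stages an I/O request in a buffer visible to both sides, and the host polls the buffer and services it. The structure, names, and protocol below are illustrative assumptions, not the paper's actual implementation.

```cuda
// Hedged sketch (assumed design, not the paper's code): a device-to-host
// "printf" request channel built on CUDA mapped host memory.
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

struct IoRequest {
    volatile int ready;   // raised by the device when a message is staged
    char message[256];    // payload the host agent will print
};

__global__ void kernel_write_message(IoRequest *req) {
    // Device side: copy a message into the mapped buffer, then raise the flag.
    const char text[] = "hello from the GPU";
    for (int i = 0; i < (int)sizeof(text); ++i)
        req->message[i] = text[i];
    __threadfence_system();   // make the payload visible to the host first
    req->ready = 1;
}

int main() {
    IoRequest *host_req, *dev_req;
    // Allocate pinned host memory that is mapped into the device address space.
    cudaHostAlloc((void **)&host_req, sizeof(IoRequest), cudaHostAllocMapped);
    memset((void *)host_req, 0, sizeof(IoRequest));
    cudaHostGetDevicePointer((void **)&dev_req, host_req, 0);

    kernel_write_message<<<1, 1>>>(dev_req);

    // Host agent: poll the mapped buffer and service the request.
    // (A real agent would run in its own thread and handle many requests.)
    while (host_req->ready == 0) { }
    printf("%s\n", host_req->message);

    cudaFreeHost(host_req);
    return 0;
}
```

Because the kernel launch is asynchronous, the host can service requests while the kernel is still running; this is what lets such a scheme respond to device-side I/O in real time rather than only after kernel completion.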
