Paper information - CrystalGPU: Transparent and Efficient Utilization of GPU Power

CrystalGPU: Transparent and Efficient Utilization of GPU Power

General-purpose computing on graphics processing units (GPGPU) has recently gained considerable attention in various domains such as bioinformatics, databases, and distributed computing. GPGPU uses the GPU as a co-processor to offload computationally intensive tasks from the CPU. This study starts from the observation that a number of GPU optimizations (such as overlapping communication and computation, short-lived buffer reuse, and harnessing multi-GPU systems) can be abstracted and reused across different GPGPU applications. This paper describes CrystalGPU, a modular framework that transparently enables applications to exploit these optimizations. Our evaluation shows that CrystalGPU enables speedups of up to 16x on synthetic benchmarks while introducing negligible latency overhead.

Matei Ripeanu | Abdullah Gharaibeh | Samer Al-Kiswany

[1] Matei Ripeanu, et al. StoreGPU: exploiting graphics processing units to accelerate distributed storage systems, 2008, HPDC '08.

[2] John D. Owens, et al. GPU Computing, 2008, Proceedings of the IEEE.

[3] Sotiris Ioannidis, et al. Gnort: High Performance Network Intrusion Detection Using Graphics Processors, 2008, RAID.

[4] Robert Strzodka, et al. Exploring weak scalability for FEM calculations on a GPU-enhanced cluster, 2007, Parallel Comput.

[5] Sean Quinlan, et al. Venti: A New Approach to Archival Storage, 2002, FAST.

[6] Neelam Goyal, et al. Signature Matching in Network Processing using SIMD / GPU Architectures, 2007.

[7] Jens H. Krüger, et al. A Survey of General‐Purpose Computation on Graphics Hardware, 2007, Eurographics.

[8] Vijay S. Pande, et al. Folding@Home and Genome@Home: Using distributed computing to tackle previously intractable problem, 2009, 0901.0866.

[9] Wen-mei W. Hwu, et al. Optimization principles and application performance evaluation of a multithreaded GPU using CUDA, 2008, PPoPP.

[10] Gang Peng, et al. Multi-dimensional storage virtualization, 2004, SIGMETRICS '04/Performance '04.

[11] Michael D. McCool, et al. Metaprogramming GPUs with Sh, 2004.

[12] Mark Oskin, et al. Using modern graphics architectures for general-purpose computing: a framework and analysis, 2002, MICRO 35.

[13] Jonas Tölke, et al. Implementation of a Lattice Boltzmann kernel using the Compute Unified Device Architecture developed by nVIDIA, 2009, Comput. Vis. Sci.

[14] Ben Y. Zhao, et al. OceanStore: an architecture for global-scale persistent storage, 2000, SIGP.