An investigation of Unified Memory Access performance in CUDA
Raphael Landaverde | Martin C. Herbordt | Ayse K. Coskun | Tiansheng Zhang
[1] Margaret Martonosi, et al. Reducing GPU offload latency via fine-grained CPU-GPU synchronization, 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).
[2] Bormin Huang, et al. GPU Acceleration of Predictive Partitioned Vector Quantization for Ultraspectral Sounder Data Compression, 2011, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.
[3] Erik Lindholm, et al. NVIDIA Tesla: A Unified Graphics and Computing Architecture, 2008, IEEE Micro.
[4] Martin C. Herbordt, et al. GPU acceleration of a production molecular docking code, 2009, GPGPU-2.
[5] Kevin Skadron, et al. Rodinia: A benchmark suite for heterogeneous computing, 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[6] Murat Efe Guney, et al. On the limits of GPU acceleration, 2010.
[7] Manish Vachharajani, et al. GPU acceleration of numerical weather prediction, 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.