Paper information - Measuring the Impact of Configuration Parameters

Measuring the Impact of Configuration Parameters

The choice of threadblock size and shape is one of the most important decisions a user makes when coding a parallel problem to run on GPU architectures. In fact, threadblock configuration has a significant

Yuri Torres | Arturo Gonzalez-Escribano | Diego R. Llanos

[1] Hyesoon Kim, et al. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness, 2009, ISCA '09.

[2] Arturo González-Escribano, et al. Using Fermi Architecture Knowledge to Speed up CUDA and OpenCL Programs, 2012, 2012 IEEE 10th International Symposium on Parallel and Distributed Processing with Applications.

[3] Yao Zhang, et al. A quantitative performance analysis model for GPU architectures, 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.

[4] Yuri Torres, et al. Understanding the impact of CUDA tuning techniques for Fermi, 2011, 2011 International Conference on High Performance Computing & Simulation.

[5] Andreas Moshovos, et al. Demystifying GPU microarchitecture through microbenchmarking, 2010, 2010 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS).

[6] Xiaoming Li, et al. A Micro-benchmark Suite for AMD GPUs, 2010, 2010 39th International Conference on Parallel Processing Workshops.