CTA-Aware Dynamic Scheduling Scheme for Streaming Multiprocessors in High-Performance GPUs
暂无分享,去创建一个
Jong-Myon Kim | Cheol Hong Kim | Dong Oh Son | Cong Thuan Do | Hong Jun Choi | Jaehyung Park | Jaehyung Park | Jong-Myon Kim | C. Kim | H. Choi | D. Son
[1] Pradeep Dubey,et al. Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU , 2010, ISCA.
[2] Mahmut T. Kandemir,et al. Application-aware Memory System for Fair and Efficient Execution of Concurrent GPGPU Applications , 2014, GPGPU@ASPLOS.
[3] Vikas Agarwal,et al. Clock rate versus IPC: the end of the road for conventional microarchitectures , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[4] Jung Ho Ahn,et al. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[5] Nam Sung Kim,et al. GPUWattch: enabling energy optimizations in GPGPUs , 2013, ISCA.
[6] Henry Wong,et al. Analyzing CUDA workloads using a detailed GPU simulator , 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[7] J. E. Thornton,et al. Parallel operation in the control data 6600 , 1964, AFIPS '64 (Fall, part II).
[8] Pat Hanrahan,et al. Brook for GPUs: stream computing on graphics hardware , 2004, SIGGRAPH 2004.
[9] John D. Owens,et al. General Purpose Computation on Graphics Hardware , 2005, IEEE Visualization.