Simultaneous Multikernel: Fine-Grained Sharing of GPUs
暂无分享,去创建一个
Rami G. Melhem | Minyi Guo | Jun Yang | Youtao Zhang | Bruce R. Childers | Zhenning Wang | Jun Yang | Youtao Zhang | M. Guo | R. Melhem | B. Childers | Zhenning Wang
[1] John Kim,et al. Improving GPGPU resource utilization through alternative thread block scheduling , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[2] Shinpei Kato,et al. TimeGraph: GPU Scheduling for Real-Time Multi-Tasking Environments , 2011, USENIX Annual Technical Conference.
[3] Henry Wong,et al. Analyzing CUDA workloads using a detailed GPU simulator , 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[4] Mateo Valero,et al. Enabling preemptive multiprogramming on GPUs , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).
[5] Scott A. Mahlke,et al. Chimera: Collaborative Preemption for Multitasking on a Shared GPU , 2015, ASPLOS.
[6] R. Govindarajan,et al. Improving GPGPU concurrency with elastic kernels , 2013, ASPLOS '13.
[7] Nam Sung Kim,et al. The case for GPGPU spatial multitasking , 2012, IEEE International Symposium on High-Performance Comp Architecture.
[8] Stijn Eyerman,et al. System-Level Performance Metrics for Multiprogram Workloads , 2008, IEEE Micro.
[9] Wei Yi,et al. Kernel Fusion: An Effective Method for Better Power Efficiency on Multithreaded GPU , 2010, 2010 IEEE/ACM Int'l Conference on Green Computing and Communications & Int'l Conference on Cyber, Physical and Social Computing.
[10] Kevin Skadron,et al. Fine-grained resource sharing for concurrent GPGPU kernels , 2012, HotPar'12.
[11] Babak Falsafi,et al. Reference idempotency analysis: a framework for optimizing speculative execution , 2001, PPoPP '01.
[12] Michael L. Scott,et al. Disengaged scheduling for fair, protected access to fast computational accelerators , 2014, ASPLOS.
[13] Benjamin Hindman,et al. Dominant Resource Fairness: Fair Allocation of Multiple Resource Types , 2011, NSDI.
[14] Wen-mei W. Hwu,et al. Parboil: A Revised Benchmark Suite for Scientific and Commercial Throughput Computing , 2012 .
[15] Dong Li,et al. Enabling and Exploiting Flexible Task Assignment on GPU through SM-Centric Program Transformations , 2015, ICS.