Multi-threaded Kernel Offloading to GPGPU Using Hyper-Q on Kepler Architecture
[1] Tarek A. El-Ghazawi, et al. Exploiting concurrent kernel execution on graphic processing units, 2011, 2011 International Conference on High Performance Computing & Simulation.
[2] Joseph Zambreno, et al. Increasing GPU throughput using kernel interleaved thread block scheduling, 2013, 2013 IEEE 31st International Conference on Computer Design (ICCD).
[3] R. Govindarajan, et al. Improving GPGPU concurrency with elastic kernels, 2013, ASPLOS '13.
[4] T. Steinke, et al. On Improving the Performance of Multi-threaded CUDA Applications with Concurrent Kernel Execution by Kernel Reordering, 2012, 2012 Symposium on Application Accelerators in High Performance Computing.
[5] Vikram K. Narayana, et al. Scaling scientific applications on clusters of hybrid multicore/GPU nodes, 2011, CF '11.
[6] Long Chen, et al. Dynamic load balancing on single- and multi-GPU systems, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[7] Long Chen, et al. Exploring Fine-Grained Task-Based Execution on Multi-GPU Systems, 2011, 2011 IEEE International Conference on Cluster Computing.
[8] Kevin Skadron, et al. Enabling Task Parallelism in the CUDA Scheduler, 2009.
[9] Wen-mei W. Hwu, et al. GPU Computing Gems Jade Edition, 2011.
[10] Klaus Schulten, et al. Adapting a message-driven parallel application to GPU-accelerated clusters, 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[11] Wen-mei W. Hwu, et al. GPU Computing Gems Emerald Edition, 2011.