Evaluating Multiple Streams on Heterogeneous Platforms
Canqun Yang | Xuhao Chen | Cheng Chen | Jianbin Fang | Tao Tang | Zhaokui Li | Peng Zhang
[1] Hiroaki Kobayashi, et al. SPRAT: Runtime processor selection for energy-aware computing, 2008, 2008 IEEE International Conference on Cluster Computing.
[2] Eduard Ayguadé, et al. AMA: Asynchronous Management of Accelerators for Task-based Programming Models, 2015, ICCS.
[3] Scott B. Baden, et al. Modeling and predicting performance of high performance computing applications on hardware accelerators, 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum.
[4] José Ignacio Benavides Benítez, et al. Performance models for asynchronous data transfers on consumer Graphics Processing Units, 2012, J. Parallel Distributed Comput.
[5] Fumihiko Ino, et al. GPU-Chariot: A Programming Framework for Stream Applications Running on Multi-GPU Systems, 2013, IEICE Trans. Inf. Syst.
[6] Jiayuan Meng, et al. Improving GPU Performance Prediction with Data Transfer Modeling, 2013, 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum.
[7] Nam Sung Kim, et al. The case for GPGPU spatial multitasking, 2012, IEEE International Symposium on High-Performance Computer Architecture.
[8] Kai Lu, et al. Adaptive Optimization for Petascale Heterogeneous CPU/GPU Computing, 2010, 2010 IEEE International Conference on Cluster Computing.
[9] J. Xu. OpenCL – The Open Standard for Parallel Programming of Heterogeneous Systems, 2009.
[10] John D. Owens, et al. GPU Computing, 2008, Proceedings of the IEEE.
[11] Kim M. Hazelwood, et al. Where is the data? Why you cannot debate CPU vs. GPU performance without the answer, 2011, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[12] Thomas Steinke, et al. Multi-threaded Kernel Offloading to GPGPU Using Hyper-Q on Kepler Architecture, 2014.
[13] Alejandro Duran, et al. Heterogeneous Streaming, 2016, 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[14] David A. Patterson, et al. Computer Architecture: A Quantitative Approach, 1969.
[15] Jeffrey S. Vetter, et al. A Survey of CPU-GPU Heterogeneous Computing Techniques, 2015, ACM Comput. Surv.
[16] Jason Maassen, et al. Performance Models for CPU-GPU Data Transfers, 2014, 2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.
[17] Zheng Gong, et al. Software pipelining for graphic processing unit acceleration: Partition, scheduling and granularity, 2016, Int. J. High Perform. Comput. Appl.
[18] Chao Yang, et al. A peta-scalable CPU-GPU algorithm for global atmospheric simulations, 2013, PPoPP '13.
[19] Thomas Steinke, et al. Concurrent Kernel Execution on Xeon Phi within Parallel Heterogeneous Workloads, 2014, Euro-Par.
[20] Anand Raghunathan, et al. MDR: performance model driven runtime for heterogeneous parallel platforms, 2011, ICS '11.
[21] Kevin Skadron, et al. Rodinia: A benchmark suite for heterogeneous computing, 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[22] Jeffrey S. Vetter, et al. Maestro: Data Orchestration and Tuning for OpenCL Devices, 2010, Euro-Par.