Towards Automating Multi-dimensional Data Decomposition for Executing a Single-GPU Code on a Multi-GPU System
[1] Thomas B. Jablin, et al. Automatic Parallelization of Kernels in Shared-Memory Multi-GPU Nodes, 2015, ICS.
[2] Thomas Ertl, et al. A Compute Unified System Architecture for Graphics Clusters Incorporating Data Locality, 2009, IEEE Transactions on Visualization and Computer Graphics.
[3] Jungwon Kim, et al. Achieving a single compute device image in OpenCL for multiple GPUs, 2011, PPoPP '11.
[4] Scott A. Mahlke, et al. Transparent CPU-GPU collaboration for data-parallel kernels on heterogeneous systems, 2013, Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques.
[5] Arun K. Somani, et al. Automatic Parallelization of GPU Applications Using OpenCL, 2015, 2015 Asia-Pacific Conference on Computer Aided System Engineering.
[6] Feng Ji, et al. RSVM: A Region-based Software Virtual Memory for GPU, 2013, Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques.
[7] Jungwon Kim, et al. SnuCL: an OpenCL framework for heterogeneous CPU/GPU clusters, 2012, ICS '12.
[8] Anuj Agarwal, et al. Analysis of sleep traits in knockout mice from the large-scale KOMP2 population using a non-invasive, high-throughput piezoelectric system, 2015, BMC Bioinformatics.
[9] Rudolf Eigenmann, et al. OpenMPC: extended OpenMP for efficient programming and tuning on GPUs, 2013, Int. J. Comput. Sci. Eng.
[10] Fumihiko Ino, et al. Accelerating the Smith-Waterman algorithm with interpair pruning and band optimization for the all-pairs comparison of base sequences, 2015, BMC Bioinformatics.
[11] Scott A. Mahlke, et al. VAST: The illusion of a large memory space for GPUs, 2014, 2014 23rd International Conference on Parallel Architecture and Compilation (PACT).
[12] Fumihiko Ino, et al. PACC: An Extension of OpenACC for Pipelined Processing of Large Data on a GPU, 2014.
[13] John D. Owens, et al. GPU Computing, 2008, Proceedings of the IEEE.
[14] Fumihiko Ino, et al. GPU-Chariot: A Programming Framework for Stream Applications Running on Multi-GPU Systems, 2013, IEICE Trans. Inf. Syst.
[15] Hyesoon Kim, et al. Qilin: Exploiting parallelism on heterogeneous multiprocessors with adaptive mapping, 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[16] Rohit Chandra, et al. Parallel Programming in OpenMP, 2000.
[17] Fumihiko Ino, et al. Improving cache locality for GPU-based volume rendering, 2014, Parallel Comput.
[18] Jonathan Blancas, et al. NVIDIA GeForce GTX 980 Ti, análisis, 2015.