VectorPU: A Generic and Efficient Data-container and Component Model for Transparent Data Transfer on GPU-based Heterogeneous Systems
[1] Cédric Augonnet, et al. StarPU: a unified platform for task scheduling on heterogeneous multicore architectures, 2011, Concurr. Comput. Pract. Exp.
[2] Rudolf Eigenmann, et al. OpenMPC: Extended OpenMP Programming and Tuning for GPUs, 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[3] Christoph W. Kessler, et al. XPDL: Extensible Platform Description Language to Support Energy Modeling and Optimization, 2015, 2015 44th International Conference on Parallel Processing Workshops.
[4] Clemens Grelck, et al. Towards Heterogeneous Computing without Heterogeneous Programming, 2012, Trends in Functional Programming.
[5] Vivek Sarkar, et al. Compiling and Optimizing Java 8 Programs for GPU Execution, 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[6] Dong Li, et al. Interactive Program Debugging and Optimization for Directive-Based, Efficient GPU Computing, 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.
[7] Feng Liu, et al. Dynamically managed data for CPU-GPU architectures, 2012, CGO '12.
[8] R. Govindarajan, et al. Fast and efficient automatic memory management for GPUs using compiler-assisted runtime coherence scheme, 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).
[9] John E. Stone, et al. An asymmetric distributed shared memory model for heterogeneous parallel systems, 2010, ASPLOS XV.
[10] Christoph W. Kessler, et al. Smart Containers and Skeleton Programming for GPU-Based Systems, 2015, International Journal of Parallel Programming.
[11] Christoph Kessler, et al. MeterPU: A Generic Measurement Abstraction API Enabling Energy-Tuned Skeleton Backend Selection, 2015, TrustCom 2015.
[12] David I. August, et al. Automatic CPU-GPU communication management and optimization, 2011, PLDI '11.
[13] Raphael Landaverde, et al. An investigation of Unified Memory Access performance in CUDA, 2014, 2014 IEEE High Performance Extreme Computing Conference (HPEC).
[14] Milind Kulkarni, et al. SemCache: semantics-aware caching for efficient GPU offloading, 2013, ICS '13.