Static Compilation Analysis for Host-Accelerator Communication Optimization
[1] Mehdi Amini, et al. A Particle-Mesh Integrator for Galactic Dynamics Powered by GPGPUs, 2009, ICCS.
[2] Pierre Jouvelot, et al. Semantical interprocedural parallelization: an overview of the PIPS project, 1991.
[3] Rudolf Eigenmann, et al. OpenMPC: Extended OpenMP Programming and Tuning for GPUs, 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[4] Hiroki Honda, et al. OMPCUDA: OpenMP Execution Framework for CUDA Based on Omni OpenMP Compiler, 2010, IWOMP.
[5] Kevin Skadron, et al. HotSpot: a compact thermal modeling methodology for early-stage VLSI design, 2006, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[6] Michael Gerndt, et al. Optimizing Communication in Superb, 1990, CONPAR.
[7] Pierre Jouvelot, et al. PIPS Is not (just) Polyhedral Software Adding GPU Code Generation in PIPS, 2011.
[8] François Irigoin, et al. Interprocedural Array Region Analyses, 1996, International Journal of Parallel Programming.
[9] P. Feautrier. Parametric integer programming, 1988.
[10] Samuel Williams, et al. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures, 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[11] Michael Wolfe, et al. Implementing the PGI Accelerator model, 2010, GPGPU-3.
[12] Yifeng Chen, et al. Large-scale FFT on GPU clusters, 2010, ICS '10.
[13] Rudolf Eigenmann, et al. OpenMP to GPGPU: a compiler framework for automatic translation and optimization, 2009, PPoPP '09.
[14] Rami G. Melhem, et al. Compilation Techniques for Optimizing Communication on Distributed-Memory Systems, 1993, 1993 International Conference on Parallel Processing - ICPP'93.
[15] Cédric Augonnet, et al. StarPU: a unified platform for task scheduling on heterogeneous multicore architectures, 2011, Concurr. Comput. Pract. Exp.
[16] Kevin Skadron, et al. Rodinia: A benchmark suite for heterogeneous computing, 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[17] Tarek S. Abdelrahman, et al. hiCUDA: a high-level directive-based language for GPU programming, 2009, GPGPU-2.
[18] François Bodin, et al. Heterogeneous multicore parallel programming for graphics processing units, 2009, Sci. Program.
[19] Bingsheng He, et al. Database compression on graphics processors, 2010, Proc. VLDB Endow.
[20] David I. August, et al. Automatic CPU-GPU communication management and optimization, 2011, PLDI '11.
[21] Corinne Ancourt, et al. A Linear Algebra Framework for Static High Performance Fortran Code Distribution, 1997, Sci. Program.
[22] Vivek Sarkar, et al. JCUDA: A Programmer-Friendly Interface for Accelerating Java Programs with CUDA, 2009, Euro-Par.