Porting irregular reductions on heterogeneous CPU-GPU configurations
[1] Yunheung Paek,et al. Parallel Programming with Polaris , 1996, Computer.
[2] Geoffrey C. Fox,et al. Runtime Support and Compilation Methods for User-Specified Irregular Data Distributions , 1995, IEEE Trans. Parallel Distributed Syst..
[3] K. Kennedy,et al. Index Array Flattening Through Program Transformation , 1995, Proceedings of the IEEE/ACM SC95 Conference.
[4] Metin Nafi Gürcan,et al. Coordinating the use of GPU and CPU for improving performance of compute intensive applications , 2009, 2009 IEEE International Conference on Cluster Computing and Workshops.
[5] Ken Kennedy,et al. Improving memory hierarchy performance for irregular applications , 1999, ICS '99.
[6] Scott M. Murman,et al. Performance of a new CFD flow solver using a hybrid programming paradigm , 2005, J. Parallel Distributed Comput..
[7] Prithviraj Banerjee,et al. Exploiting spatial regularity in irregular iterative applications , 1995, Proceedings of 9th International Parallel Processing Symposium.
[8] Joel H. Saltz,et al. Interprocedural data flow based optimizations for distributed memory compilation , 1997 .
[9] Charles Koelbel,et al. Compiling Global Name-Space Parallel Loops for Distributed Execution , 1991, IEEE Trans. Parallel Distributed Syst..
[10] Andrew B. White,et al. Trailblazing with Roadrunner , 2009, Computing in Science & Engineering.
[11] Monica S. Lam,et al. Maximizing Multiprocessor Performance with the SUIF Compiler , 1996, Digit. Tech. J..
[12] Hyesoon Kim,et al. Qilin: Exploiting parallelism on heterogeneous multiprocessors with adaptive mapping , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[13] Joel H. Saltz,et al. The Design and Implementation of a Parallel Unstructured Euler Solver Using Software Primitives (ICASE Report No. 92-12) , 2022 .
[14] Reinhard von Hanxleden. Handling Irregular Problems with Fortran D: A Preliminary Report (D Newsletter #9) , 1993 .
[15] Amar Shan,et al. Heterogeneous processing: a strategy for augmenting Moore's law , 2006 .
[16] Larry Carter,et al. Localizing non-affine array references , 1999, 1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425).
[17] Harry Berryman,et al. Distributed Memory Compiler Design for Sparse Problems , 1995, IEEE Trans. Computers.
[18] Joel H. Saltz,et al. Parallelizing Molecular Dynamics Programs for Distributed Memory Machines: An Application of the Cha , 1994 .
[19] Gregory Diamos,et al. Harmony: an execution model and runtime for heterogeneous many core systems , 2008, HPDC '08.
[20] Chau-Wen Tseng,et al. A Comparison of Locality Transformations for Irregular Codes , 2000, LCR.
[21] Gagan Agrawal,et al. Compiler and runtime support for enabling generalized reduction computations on heterogeneous parallel configurations , 2010, ICS '10.
[22] Hasan U. Akay,et al. Dynamic Load-Balancing for Distributed Heterogeneous Computing of Parallel CFD Problems , 2000 .
[23] Chau-Wen Tseng,et al. Improving Compiler and Run-Time Support for Irregular Reductions Using Local Writes , 1998, LCPC.
[24] James R. Larus,et al. Efficient support for irregular applications on distributed-memory machines , 1995, PPOPP '95.
[25] David A. Padua,et al. On the Automatic Parallelization of Sparse and Irregular Fortran Programs , 1998, LCR.
[26] Gagan Agrawal,et al. An execution strategy and optimized runtime support for parallelizing irregular reductions on modern GPUs , 2011, ICS '11.
[27] Michael Garland. Sparse matrix computations on manycore GPU’s , 2008, 2008 45th ACM/IEEE Design Automation Conference.
[28] Dimitri J. Mavriplis,et al. The design and implementation of a parallel unstructured Euler solver using software primitives , 1992 .
[29] Emilio L. Zapata,et al. A compiler method for the parallel execution of irregular reductions in scalable shared memory multiprocessors , 2000, ICS '00.
[30] Surendra Byna,et al. Data-aware scheduling of legacy kernels on heterogeneous platforms with distributed memory , 2010, SPAA '10.