A Loop Transformation Algorithm for Communication Overlapping
[1] Rice University, et al. High performance Fortran language specification, 1993.
[2] Ken Kennedy, et al. Compiling programs for distributed-memory multiprocessors, 2004, The Journal of Supercomputing.
[3] Monica S. Lam, et al. Maximizing parallelism and minimizing synchronization with affine transforms, 1997, POPL '97.
[4] Marc Snir, et al. The Communication Software and Parallel Environment of the IBM SP2, 1995, IBM Syst. J.
[5] Kazuaki Ishizaki, et al. A Loop Parallelization Algorithm for HPF Compilers, 1995, LCPC.
[6] Charles Koelbel, et al. Supporting shared data structures on distributed memory architectures, 1990, PPOPP '90.
[7] Michael Wolfe, et al. High performance compilers for parallel computing, 1995.
[8] K. Kennedy, et al. Preliminary experiences with the Fortran D compiler, 1993, Supercomputing '93.
[9] Ken Kennedy, et al. GIVE-N-TAKE—a balanced code placement framework, 1994, PLDI '94.
[10] Geoffrey C. Fox, et al. Fortran 90D/HPF compiler for distributed memory MIMD computers: design, implementation, and performance results, 1993, Supercomputing '93.
[11] Hidetoshi Iwashita, et al. HPF compiler for the AP1000, 1995, ICS '95.
[12] Michael Philippsen, et al. Automatic alignment of array data and processes to reduce communication time on DMPPs, 1995, PPOPP '95.
[13] Prithviraj Banerjee, et al. Techniques to overlap computation and communication in irregular iterative applications, 1994, ICS '94.
[14] Monica S. Lam, et al. Data and computation transformations for multiprocessors, 1995, PPOPP '95.
[15] Monica S. Lam, et al. A data locality optimizing algorithm, 1991, PLDI '91.
[16] Toshio Nakatani, et al. Detection and global optimization of reduction operations for distributed parallel machines, 1996, ICS '96.
[17] Michael Wolfe, et al. More iteration space tiling, 1989, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).
[18] Tilak Agerwala, et al. SP2 System Architecture, 1999, IBM Syst. J.
[19] Kenichi Hayashi, et al. Improving AP1000 parallel computer performance with message communication, 1993, ISCA '93.
[20] Ken Kennedy, et al. Compiling Fortran D for MIMD distributed-memory machines, 1992, CACM.
[21] Anne Rogers, et al. Process decomposition through locality of reference, 1989, PLDI '89.
[22] John A. Chandy, et al. Communication Optimizations Used in the Paradigm Compiler for Distributed-Memory Multicomputers, 1994, 1994 International Conference on Parallel Processing Vol. 2.
[23] Monica S. Lam, et al. A Loop Transformation Theory and an Algorithm to Maximize Parallelism, 1991, IEEE Trans. Parallel Distributed Syst.
[24] T. von Eicken, et al. Parallel programming in Split-C, 1993, Supercomputing '93.
[25] Hiroshi Ohta, et al. Optimal tile size adjustment in compiling general DOACROSS loop nests, 1995, ICS '95.
[26] Chau-Wen Tseng. An optimizing Fortran D compiler for MIMD distributed-memory machines, 1993.
[27] Monica S. Lam, et al. The SUIF Compiler System: a Parallelizing and Optimizing Research Compiler, 1994.
[28] Michael Gerndt, et al. SUPERB: A tool for semi-automatic MIMD/SIMD parallelization, 1988, Parallel Comput.