Communication-Minimal Tiling of Uniform Dependence Loops
[1] Corinne Ancourt, et al. Minimal Data Dependence Abstractions for Loop Transformations, 1994, LCPC.
[2] Michael Wolfe, et al. High performance compilers for parallel computing, 1995.
[3] Vivek Sarkar, et al. Locality Analysis for Distributed Shared-Memory Multiprocessors, 1996, LCPC.
[4] Hiroshi Ohta, et al. Optimal tile size adjustment in compiling general DOACROSS loop nests, 1995, ICS '95.
[5] William Jalby, et al. Impact of Hierarchical Memory Systems On Linear Algebra Algorithm Design, 1988.
[6] Chung-Ta King, et al. Grouping in Nested Loops for Parallel Execution on Multicomputers, 1989, International Conference on Parallel Processing.
[7] D. Sorensen, et al. Block reduction of matrices to condensed forms for eigenvalue computations, 1990.
[8] François Irigoin, et al. Supernode partitioning, 1988, POPL '88.
[9] Monica S. Lam, et al. A data locality optimizing algorithm, 1991, PLDI '91.
[10] Monica S. Lam, et al. A Loop Transformation Theory and an Algorithm to Maximize Parallelism, 1991, IEEE Trans. Parallel Distributed Syst.
[11] Ken Kennedy, et al. Compiler blockability of numerical algorithms, 1992, Proceedings Supercomputing '92.
[12] Gene H. Golub, et al. Matrix computations, 1983.
[13] Jack Dongarra, et al. Automatic Blocking of Nested Loops, 1990.
[14] Yves Robert, et al. (Pen)-ultimate tiling?, 1994, Integr.
[15] Monica S. Lam, et al. The cache performance and optimizations of blocked algorithms, 1991, ASPLOS IV.
[16] Jingling Xue, et al. On Tiling as a Loop Transformation, 1997, Parallel Process. Lett.
[17] Anne Rogers, et al. Compiling for Distributed Memory Architectures, 1994, IEEE Trans. Parallel Distributed Syst.
[18] Monica D. Lam, et al. The cache performance and optimizations of blocked algorithms, 1991.
[19] David K. Smith. Theory of Linear and Integer Programming, 1987.
[20] Utpal Banerjee. Loop Parallelization, 1994, Springer US.
[21] Michael Wolfe, et al. More iteration space tiling, 1989, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).
[22] Michael Wolfe, et al. Iteration Space Tiling for Memory Hierarchies, 1987, PPSC.
[23] Ken Kennedy, et al. Cross-Loop Reuse Analysis and Its Application to Cache Optimizations, 1996, LCPC.