Synthesizing transformations for locality enhancement of imperfectly-nested loop nests
[1] Keshav Pingali,et al. Compiling Imperfectly-nested Sparse Matrix Codes with Dependences , 2000 .
[2] Monica S. Lam,et al. A data locality optimizing algorithm , 1991, PLDI '91.
[3] Fred G. Gustavson,et al. Recursion leads to automatic variable blocking for dense linear-algebra algorithms , 1997, IBM J. Res. Dev..
[4] Ken Kennedy,et al. Compiler blockability of numerical algorithms , 1992, Proceedings Supercomputing '92.
[5] Keshav Pingali,et al. Next-generation generic programming and its application to sparse matrix computations , 2000, ICS '00.
[6] Sharad Malik,et al. Cache miss equations: an analytical representation of cache misses , 1997, ICS '97.
[7] William Pugh,et al. Iteration Space Slicing for Locality , 1999, LCPC.
[8] Keshav Pingali,et al. A Framework for Sparse Matrix Code Synthesis from High-level Specifications , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[9] Michael Wolfe,et al. High performance compilers for parallel computing , 1995 .
[10] S. Kung,et al. VLSI Array processors , 1985, IEEE ASSP Magazine.
[11] Yves Robert,et al. (Pen)-ultimate tiling? , 1994, Integr..
[12] Monica S. Lam,et al. The cache performance and optimizations of blocked algorithms , 1991, ASPLOS IV.
[13] J. Ramanujam,et al. Tiling Multidimensional Iteration Spaces for Multicomputers , 1992, J. Parallel Distributed Comput..
[14] William Pugh,et al. Finding Legal Reordering Transformations Using Mappings , 1994, LCPC.
[15] Corinne Ancourt,et al. Scanning polyhedra with DO loops , 1991, PPOPP '91.
[16] Keshav Pingali,et al. Data-centric multi-level blocking , 1997, PLDI '97.
[17] Ken Kennedy,et al. Transforming loops to recursion for multi-level memory hierarchies , 2000, PLDI '00.
[18] Jordi Torres,et al. Partitioning the statement per iteration space using non-singular matrices , 1993, ICS '93.
[19] Kathryn S. McKinley,et al. Tile size selection using cache organization and data layout , 1995, PLDI '95.
[20] Ken Kennedy,et al. Optimizing for parallelism and data locality , 1992, ICS '92.
[21] Keshav Pingali,et al. Access normalization: loop restructuring for NUMA computers , 1993, TOCS.
[22] Monica S. Lam,et al. Maximizing Parallelism and Minimizing Synchronization with Affine Partitions , 1998, Parallel Comput..
[23] William Pugh,et al. Selecting Affine Mappings Based on Performance Estimation , 1994, Parallel Process. Lett..
[24] Keshav Pingali,et al. Left-Looking to Right-Looking and Vice Versa: An Application of Fractal Symbolic Analysis to Linear Algebra Code Restructuring , 2000, Euro-Par.
[25] Zhiyuan Li,et al. New tiling techniques to improve cache temporal locality , 1999, PLDI '99.
[26] Utpal Banerjee,et al. A theory of loop permutations , 1990 .
[27] Keshav Pingali,et al. An experimental evaluation of tiling and shackling for memory hierarchy management , 1999, ICS '99.
[28] Gene H. Golub,et al. Matrix computations , 1983 .
[29] Jack Dongarra,et al. Automatic Blocking of Nested Loops , 1990 .
[30] William Pugh,et al. Counting solutions to Presburger formulas: how and why , 1994, PLDI '94.
[31] Keshav Pingali,et al. Access normalization: loop restructuring for NUMA compilers , 1992, ASPLOS V.
[32] Keshav Pingali,et al. Automatic Generation of Block-Recursive Codes , 2000, Euro-Par.