Locality Optimizations for Multi-Level Caches
[1] Vivek Sarkar, et al. Automatic selection of high-order transformations in the IBM XL FORTRAN compilers, 1997, IBM J. Res. Dev.
[2] Chau-Wen Tseng, et al. Data transformations for eliminating conflict misses, 1998, PLDI.
[3] Tarek S. Abdelrahman, et al. Fusion of Loops for Parallelism and Locality, 1997, IEEE Trans. Parallel Distributed Syst.
[4] Vivek Sarkar, et al. On Estimating and Enhancing Cache Effectiveness, 1991, LCPC.
[5] Ken Kennedy, et al. Compiler blockability of numerical algorithms, 1992, Proceedings Supercomputing '92.
[6] V. Sarkar, et al. Collective Loop Fusion for Array Contraction, 1992, LCPC.
[7] Sharad Malik, et al. Cache miss equations: an analytical representation of cache misses, 1997, ICS '97.
[8] Kathryn S. McKinley, et al. Tile size selection using cache organization and data layout, 1995, PLDI '95.
[9] Chau-Wen Tseng, et al. Improving data locality with loop transformations, 1996, TOPL.
[10] Vivek Sarkar, et al. A compiler framework for restructuring data declarations to enhance cache and TLB effectiveness, 1994, CASCON.
[11] Michael Wolfe, et al. More iteration space tiling, 1989, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).
[12] R. C. Whaley, et al. Automatically Tuned Linear Algebra Software (ATLAS), 2011, Encyclopedia of Parallel Computing.
[13] Monica S. Lam, et al. A Loop Transformation Theory and an Algorithm to Maximize Parallelism, 1991, IEEE Trans. Parallel Distributed Syst.
[14] Mithuna Thottethodi, et al. Nonlinear array layouts for hierarchical memory systems, 1999, ICS '99.
[15] Chau-Wen Tseng, et al. Eliminating conflict misses for high performance architectures, 1998, ICS '98.
[16] Keshav Pingali, et al. An experimental evaluation of tiling and shackling for memory hierarchy management, 1999, ICS '99.
[17] Wei Li, et al. Unifying data and control transformations for distributed shared-memory machines, 1995, PLDI '95.
[18] Jacqueline Chame, et al. A tile selection algorithm for data locality and cache interference, 1999, ICS '99.
[19] Steven W. K. Tjiang, et al. SUIF: an infrastructure for research on parallelizing and optimizing compilers, 1994, SIGP.
[20] Chau-Wen Tseng, et al. A Comparison of Compiler Tiling Algorithms, 1999, CC.
[21] Mahmut T. Kandemir, et al. Improving locality using loop and data transformations in an integrated framework, 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.
[22] Ken Kennedy, et al. Maximizing Loop Parallelism and Improving Data Locality via Loop Fusion and Distribution, 1993, LCPC.
[23] Zhiyuan Li, et al. New tiling techniques to improve cache temporal locality, 1999, PLDI '99.
[24] Olivier Temam, et al. Cache interference phenomena, 1994, SIGMETRICS.
[25] Kathryn S. McKinley, et al. A Parametrized Loop Fusion Algorithm for Improving Parallelism and Cache Locality, 1997, Comput. J.
[26] Monica S. Lam, et al. A data locality optimizing algorithm, 1991, PLDI '91.
[27] Monica S. Lam, et al. The cache performance and optimizations of blocked algorithms, 1991, ASPLOS IV.
[28] Mahmut T. Kandemir, et al. A compiler algorithm for optimizing locality in loop nests, 1997, ICS '97.
[29] François Irigoin, et al. Supernode partitioning, 1988, POPL '88.