A Comparison of Compiler Tiling Algorithms
[1] Vivek Sarkar, et al. On Estimating and Enhancing Cache Effectiveness, 1991, LCPC.
[2] Ken Kennedy, et al. Compiler blockability of numerical algorithms, 1992, Proceedings Supercomputing '92.
[3] Sharad Malik, et al. Cache miss equations: an analytical representation of cache misses, 1997, ICS '97.
[4] Olivier Temam, et al. To copy or not to copy: A compile-time technique for assessing when data copying should be used to eliminate cache conflicts, 1993, Supercomputing '93. Proceedings.
[5] David A. Wood, et al. Cache profiling and the SPEC benchmarks: a case study, 1994, Computer.
[6] Monica S. Lam, et al. The cache performance and optimizations of blocked algorithms, 1991, ASPLOS IV.
[7] Karim Esseghir. Improving data locality for caches, 1993.
[8] David H. Bailey. Unfavorable Strides in Cache Memory Systems (RNR Technical Report RNR-92-015), 1995, Sci. Program.
[9] David H. Bailey. Unfavorable strides in cache memory systems, 1992.
[10] Chau-Wen Tseng, et al. Improving data locality with loop transformations, 1996, TOPLAS.
[11] Ken Kennedy, et al. Improving register allocation for subscripted variables, 1990, PLDI '90.
[12] Vivek Sarkar, et al. A compiler framework for restructuring data declarations to enhance cache and TLB effectiveness, 1994, CASCON.
[13] Michael Wolfe, et al. More iteration space tiling, 1989, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).
[14] Olivier Temam, et al. Cache interference phenomena, 1994, SIGMETRICS.
[15] Monica S. Lam, et al. The cache performance and optimizations of blocked algorithms, 1991.
[16] Chau-Wen Tseng, et al. Data transformations for eliminating conflict misses, 1998, PLDI.
[17] François Irigoin, et al. Supernode partitioning, 1988, POPL '88.
[18] Chau-Wen Tseng, et al. Eliminating conflict misses for high performance architectures, 1998, ICS '98.
[19] Mahmut T. Kandemir, et al. A compiler algorithm for optimizing locality in loop nests, 1997, ICS '97.
[20] Monica S. Lam, et al. A data locality optimizing algorithm, 1991, PLDI '91.
[21] Olivier Temam, et al. A quantitative analysis of loop nest locality, 1996, ASPLOS VII.
[22] Tarek S. Abdelrahman, et al. Fusion of Loops for Parallelism and Locality, 1997, IEEE Trans. Parallel Distributed Syst.
[23] Kathryn S. McKinley, et al. Tile size selection using cache organization and data layout, 1995, PLDI '95.
[24] W. Jalby, et al. To copy or not to copy: a compile-time technique for assessing when data copying should be used to eliminate cache conflicts, 1993, Supercomputing '93.
[25] Wei Li, et al. Unifying data and control transformations for distributed shared-memory machines, 1995, PLDI '95.
[26] Michael F. P. O'Boyle, et al. Non-singular data transformations: definition, validity and applications, 1997, ICS '97.