Improving Cache Performance Through Tiling and Data Alignment
[1] Kathryn S. McKinley, et al. Tile size selection using cache organization and data layout, 1995, PLDI '95.
[2] F. A. Seiler, et al. Numerical Recipes in C: The Art of Scientific Computing, 1989.
[3] Karim Esseghir. Improving data locality for caches, 1993.
[4] William H. Press, et al. The Art of Scientific Computing Second Edition, 1998.
[5] Olivier Temam, et al. To copy or not to copy: A compile-time technique for assessing when data copying should be used to eliminate cache conflicts, 1993, Supercomputing '93. Proceedings.
[6] Monica S. Lam, et al. The cache performance and optimizations of blocked algorithms, 1991, ASPLOS IV.
[7] David A. Patterson, et al. Computer Organization & Design: The Hardware/Software Interface, 1993.
[8] David A. Wood, et al. Cache profiling and the SPEC benchmarks: a case study, 1994, Computer.
[9] Ken Kennedy, et al. Software prefetching, 1991, ASPLOS IV.
[10] Tomás Lang, et al. MOB forms: a class of multilevel block algorithms for dense linear algebra operations, 1994, ICS '94.
[11] William H. Press, et al. Numerical recipes in C. The art of scientific computing, 1987.
[12] Paul M. Embree, et al. C Language Algorithms for Digital Signal Processing, 1991.
[13] Monica S. Lam, et al. A data locality optimizing algorithm, 1991, PLDI '91.
[14] W. Jalby, et al. To copy or not to copy: a compile-time technique for assessing when data copying should be used to eliminate cache conflicts, 1993, Supercomputing '93.
[15] Ken Kennedy, et al. Compiler blockability of numerical algorithms, 1992, Proceedings Supercomputing '92.
[16] Michael Wolfe, et al. More iteration space tiling, 1989, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).