Improving Performance by Reducing the Memory Footprint of Scientific Applications
[1] Monica S. Lam, et al. Blocking and array contraction across arbitrarily nested loops using affine partitioning, 2001, PPoPP '01.
[2] Chau-Wen Tseng, et al. Compiler optimizations for improving data locality, 1994, ASPLOS VI.
[3] Alfred V. Aho, et al. Compilers: Principles, Techniques, and Tools, 1986, Addison-Wesley series in computer science / World student series edition.
[4] David A. Patterson, et al. Computer Architecture: A Quantitative Approach, 1969.
[5] Ken Kennedy, et al. Scalarizing Fortran 90 Array Syntax, 2001.
[6] Paul N. Hilfinger, et al. Better Tiling and Array Contraction for Compiling Scientific Programs, 2002, ACM/IEEE SC 2002 Conference (SC'02).
[7] V. Sarkar, et al. Collective Loop Fusion for Array Contraction, 1992, LCPC.
[8] Robert J. Fowler, et al. Increasing Temporal Locality with Skewing and Recursive Blocking, 2001, ACM/IEEE SC 2001 Conference (SC'01).
[9] Michael Wolfe, et al. High performance compilers for parallel computing, 1995.
[10] Ken Kennedy, et al. Improving the ratio of memory operations to floating-point operations in loops, 1994, TOPLAS.
[11] Ken Kennedy, et al. Improving cache performance in dynamic applications through data and computation reorganization at run time, 1999, PLDI '99.
[12] Apan Qasem, et al. Improving Performance with Integrated Program Transformations, 2004.
[13] Monica S. Lam, et al. A data locality optimizing algorithm, 1991, PLDI '91.
[14] Allen, et al. Optimizing Compilers for Modern Architectures, 2004.
[15] Anne Mignotte, et al. Loop alignment for memory accesses optimization, 1999, Proceedings 12th International Symposium on System Synthesis.
[16] Lawrence Snyder, et al. The implementation and evaluation of fusion and contraction in array languages, 1998, PLDI '98.
[17] Larry Carter, et al. Quantifying the Multi-Level Nature of Tiling Interactions, 1997, International Journal of Parallel Programming.
[18] Monica S. Lam, et al. Cache Optimizations With Affine Partitioning, 2001, PP.
[19] Ken Kennedy, et al. Improving effective bandwidth through compiler enhancement of global cache reuse, 2001, Proceedings 15th International Parallel and Distributed Processing Symposium. IPDPS 2001.
[20] Cheng Wang, et al. Data locality enhancement by memory reduction, 2001, ICS '01.
[21] Ken Kennedy, et al. Improving register allocation for subscripted variables, 1990, SIGP.
[22] Keith D. Cooper, et al. Engineering a Compiler, 2003.
[23] William Pugh, et al. An Exact Method for Analysis of Value-based Array Data Dependences, 1993, LCPC.
[24] William Pugh, et al. The Omega Library interface guide, 1995.
[25] Kathryn S. McKinley, et al. Tile size selection using cache organization and data layout, 1995, PLDI '95.