Performance Enhancement by Memory Reduction
Yonghong Song | Rong Xu | Cheng Wang | Zhiyuan Li
[1] Yonghong Song,et al. Compiler algorithms for efficient use of memory systems , 2000 .
[2] Larry Carter,et al. Schedule-independent storage mapping for loops , 1998, ASPLOS VIII.
[3] Michael Wolfe,et al. High performance compilers for parallel computing , 1995 .
[4] Thomas R. Gross,et al. Structured dataflow analysis for arrays and its use in an optimizing compiler , 1990, Softw. Pract. Exp..
[5] Vivek Sarkar,et al. Optimization of array accesses by collective loop transformations , 1991, ICS '91.
[6] John R. Rice,et al. Problems to Test Parallel and Vector Languages -- II , 1990 .
[7] Hanif D. Sherali,et al. Linear Programming and Network Flows , 1977 .
[8] François Irigoin,et al. Interprocedural Array Region Analyses , 1996, International Journal of Parallel Programming.
[9] Michael E. Wolf,et al. Improving locality and parallelism in nested loops , 1992 .
[10] Zhiyuan Li,et al. Experience with efficient array data flow analysis for array privatization , 1997, PPOPP '97.
[11] Monica S. Lam,et al. Array-data flow analysis and its use in array privatization , 1993, POPL '93.
[12] Vivek Sarkar,et al. Optimal weighted loop fusion for parallel programs , 1997, SPAA '97.
[13] Lawrence Snyder,et al. The implementation and evaluation of fusion and contraction in array languages , 1998, PLDI '98.
[14] Zhiyuan Li,et al. New tiling techniques to improve cache temporal locality , 1999, PLDI '99.
[15] Geoffrey C. Fox,et al. Applications Benchmark Set for Fortran-D and High Performance Fortran , 1992 .
[16] Tarek S. Abdelrahman,et al. Fusion of Loops for Parallelism and Locality , 1997, IEEE Trans. Parallel Distributed Syst..
[17] Chi-Chung Lam,et al. Optimization of Memory Usage and Communication Requirements for a Class of Loops Implementing Multi-Dimensional Integrals , 1999 .
[18] Anne Mignotte,et al. Loop alignment for memory accesses optimization , 1999, Proceedings 12th International Symposium on System Synthesis.
[19] Ken Kennedy,et al. Maximizing Loop Parallelism and Improving Data Locality via Loop Fusion and Distribution , 1993, LCPC.
[20] Vivek Sarkar,et al. On Estimating and Enhancing Cache Effectiveness , 1991, LCPC.
[21] V. Sarkar,et al. Collective Loop Fusion for Array Contraction , 1992, LCPC.
[22] Ken Kennedy,et al. Improving register allocation for subscripted variables , 1990, PLDI '90.
[23] Kathryn S. McKinley,et al. A Parametrized Loop Fusion Algorithm for Improving Parallelism and Cache Locality , 1997, Comput. J..
[24] Alexander Schrijver,et al. Theory of linear and integer programming , 1986, Wiley-Interscience series in discrete mathematics and optimization.