Removing Impediments to Loop Fusion Through Code Transformations
José Nelson Amaral | Christopher Barton | Bob Blainey
[1] Yoichi Muraoka,et al. Parallelism exposure and exploitation in programs, 1971.
[2] David J. Kuck,et al. A Survey of Parallel Machine Organization and Programming , 1977, CSUR.
[3] Monica S. Lam,et al. Blocking and array contraction across arbitrarily nested loops using affine partitioning , 2001, PPoPP '01.
[4] V. Sarkar,et al. Collective Loop Fusion for Array Contraction , 1992, LCPC.
[5] Michael Hind,et al. Loop distribution with multiple exits , 1992, Proceedings Supercomputing '92.
[6] Kathryn S. McKinley,et al. A Parametrized Loop Fusion Algorithm for Improving Parallelism and Cache Locality , 1997, Comput. J.
[7] Ken Kennedy,et al. Loop distribution with arbitrary control flow , 1990, Proceedings SUPERCOMPUTING '90.
[8] Rajiv Gupta,et al. Adaptive loop transformations for scientific programs , 1995, Proceedings. Seventh IEEE Symposium on Parallel and Distributed Processing.
[9] Ken Kennedy,et al. The memory bandwidth bottleneck and its amelioration by a compiler , 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.
[10] Vivek Sarkar,et al. Optimal weighted loop fusion for parallel programs , 1997, SPAA '97.
[11] David F. Bacon,et al. Compiler transformations for high-performance computing , 1994, CSUR.
[12] Ken Kennedy,et al. Typed Fusion with Applications to Parallel and Sequential Code Generation, 1994.
[13] Ken Kennedy,et al. Improving effective bandwidth through compiler enhancement of global cache reuse , 2001, Proceedings 15th International Parallel and Distributed Processing Symposium. IPDPS 2001.
[14] Ken Kennedy,et al. Maximizing Loop Parallelism and Improving Data Locality via Loop Fusion and Distribution , 1993, LCPC.