Complexity of Multi-dimensional Loop Alignment
[1] Frédéric Vivien,et al. Combining Retiming and Scheduling Techniques for Loop Parallelization and Loop Tiling , 1997, Parallel Process. Lett..
[2] Ken Kennedy,et al. Loop distribution with arbitrary control flow , 1990, Proceedings SUPERCOMPUTING '90.
[3] W. Kelly,et al. Code generation for multiple mappings , 1995, Proceedings Frontiers '95. The Fifth Symposium on the Frontiers of Massively Parallel Computation.
[4] Ken Kennedy,et al. Maximizing Loop Parallelism and Improving Data Locality via Loop Fusion and Distribution , 1993, LCPC.
[5] Ken Kennedy,et al. Automatic decomposition of scientific programs for parallel execution , 1987, POPL '87.
[6] Leslie Lamport,et al. The parallel execution of DO loops , 1974, CACM.
[7] Monica S. Lam,et al. A Loop Transformation Theory and an Algorithm to Maximize Parallelism , 1991, IEEE Trans. Parallel Distributed Syst..
[8] J. K. Peir. Program partitioning and synchronization on multiprocessor systems , 1986 .
[9] Barbara M. Chapman,et al. Supercompilers for parallel and vector computers , 1990, ACM Press frontier series.
[10] Michael Wolfe,et al. High performance compilers for parallel computing , 1995 .
[11] Yves Robert,et al. Scheduling and Automatic Parallelization , 2000, Birkhäuser Boston.
[12] Pierre Boulet,et al. Loop Parallelization Algorithms: From Parallelism Extraction to Code Generation , 1998, Parallel Comput..
[13] William Pugh,et al. Selecting Affine Mappings Based on Performance Estimation , 1994, Parallel Process. Lett..
[14] Jih-Kwon Peir,et al. Minimum Distance: A Method for Partitioning Recurrences for Multiprocessors , 1989, IEEE Trans. Computers.
[15] Paul Feautrier. Some efficient solutions to the affine scheduling problem , 1992 .
[16] Sanjay V. Rajopadhye,et al. Generation of Efficient Nested Loops from Polyhedra , 2000, International Journal of Parallel Programming.
[17] Alain Darte,et al. Loop Shifting for Loop Compaction , 1999, LCPC.
[18] Alain Darte,et al. On the complexity of loop fusion , 1999, 1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425).
[19] Kunio Okuda,et al. Cycle Shrinking by Dependence Reduction , 1996, Euro-Par, Vol. I.
[20] Monica S. Lam,et al. Maximizing parallelism and minimizing synchronization with affine transforms , 1997, POPL '97.
[21] David S. Johnson,et al. Computers and Intractability: A Guide to the Theory of NP-Completeness , 1978 .
[22] Scott A. Mahlke,et al. High-level synthesis of nonprogrammable hardware accelerators , 2000, Proceedings IEEE International Conference on Application-Specific Systems, Architectures, and Processors.
[23] Paul Feautrier,et al. Construction of Do Loops from Systems of Affine Constraints , 1995, Parallel Process. Lett..
[24] Edwin Hsing-Mean Sha,et al. Polynomial-time nested loop fusion with full parallelism , 1996, Proceedings of the 1996 ICPP Workshop on Challenges for Parallel Processing.