Towards effective automatic parallelization for multicore systems
Uday Bondhugula | Sriram Krishnamoorthy | J. Ramanujam | P. Sadayappan | Atanas Rountev | Muthu Manikandan Baskaran | Albert Hartono
[1] William Pugh,et al. The Omega test: A fast and practical integer programming algorithm for dependence analysis , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[2] Kathryn S. McKinley,et al. Tile size selection using cache organization and data layout , 1995, PLDI '95.
[3] Paul Feautrier. Some efficient solutions to the affine scheduling problem , 1992 .
[4] S. Krishnamoorthy,et al. Affine Transformations for Communication Minimal Parallelization and Locality Optimization of Arbitrarily Nested Loop Sequences , 2007 .
[5] Yves Robert,et al. (Pen)-ultimate tiling? , 1994, Integr..
[6] Zhiyuan Li,et al. New tiling techniques to improve cache temporal locality , 1999, PLDI '99.
[7] Keshav Pingali,et al. Synthesizing transformations for locality enhancement of imperfectly-nested loop nests , 2000 .
[8] Sanjay V. Rajopadhye,et al. A Geometric Programming Framework for Optimal Multi-Level Tiling , 2004, Proceedings of the ACM/IEEE SC2004 Conference.
[9] P. Feautrier. Parametric integer programming , 1988 .
[10] J. Ramanujam,et al. Tiling Multidimensional Iteration Spaces for Multicomputers , 1992, J. Parallel Distributed Comput..
[11] Sanjay V. Rajopadhye,et al. Generation of Efficient Nested Loops from Polyhedra , 2000, International Journal of Parallel Programming.
[12] Frédéric Vivien,et al. Optimal Fine and Medium Grain Parallelism Detection in Polyhedral Reduced Dependence Graphs , 2004, International Journal of Parallel Programming.
[13] W. Kelly,et al. Code generation for multiple mappings , 1995, Proceedings Frontiers '95. The Fifth Symposium on the Frontiers of Massively Parallel Computation.
[14] François Irigoin,et al. Supernode partitioning , 1988, POPL '88.
[15] Jack Dongarra,et al. Automatic Blocking of Nested Loops , 1990 .
[16] Monica S. Lam,et al. A data locality optimizing algorithm , 1991, PLDI '91.
[17] Cédric Bastoul,et al. Code generation in the polyhedral model is easier than you think , 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004..
[18] Uday Bondhugula,et al. Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories , 2008, PPoPP.
[19] Albert Cohen,et al. Iterative Optimization in the Polyhedral Model: Part I, One-Dimensional Time , 2007, International Symposium on Code Generation and Optimization (CGO'07).
[20] Keshav Pingali,et al. Tiling Imperfectly-nested Loop Nests , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[21] Sanjay V. Rajopadhye,et al. Parameterized tiled loops for free , 2007, PLDI '07.
[22] Albert Cohen,et al. Polyhedral Code Generation in the Real World , 2006, CC.
[23] Larry Carter,et al. Selecting tile shape for minimal execution time , 1999, SPAA '99.
[24] Paul Feautrier,et al. Dataflow analysis of array and scalar references , 1991, International Journal of Parallel Programming.
[25] Corinne Ancourt,et al. Scanning polyhedra with DO loops , 1991, PPOPP '91.
[26] Monica S. Lam,et al. An affine partitioning algorithm to maximize parallelism and minimize communication , 1999, ICS '99.
[27] Christian Lengauer,et al. Loop Parallelization in the Polytope Model , 1993, CONCUR.
[28] Paul Feautrier,et al. Some efficient solutions to the affine scheduling problem. Part II. Multidimensional time , 1992, International Journal of Parallel Programming.
[29] Monica S. Lam,et al. Blocking and array contraction across arbitrarily nested loops using affine partitioning , 2001, PPoPP '01.
[30] David Parello,et al. Semi-Automatic Composition of Loop Transformations for Deep Parallelism and Memory Hierarchies , 2006, International Journal of Parallel Programming.
[31] Sanjay V. Rajopadhye,et al. Multi-level tiling: M for the price of one , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[32] Paul Feautrier,et al. Some efficient solutions to the affine scheduling problem. I. One-dimensional time , 1992, International Journal of Parallel Programming.
[33] Martin Griebl,et al. Code generation in the polytope model , 1998, Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.98EX192).
[34] Keshav Pingali,et al. Tiling Imperfectly-nested Loop Nests (REVISED) , 2000 .
[35] David Parello,et al. Facilitating the search for compositions of program transformations , 2005, ICS '05.
[36] Ken Kennedy,et al. Transforming Complex Loop Nests for Locality , 2004, The Journal of Supercomputing.
[37] Martin Griebl,et al. Automatic Parallelization of Loop Programs for Distributed Memory Architectures , 2004 .
[38] Monica S. Lam,et al. Maximizing Parallelism and Minimizing Synchronization with Affine Partitions , 1998, Parallel Comput..