Non-affine Extensions to Polyhedral Code Generation
Mary W. Hall | Anand Venkat | Michelle Mills Strout | Manu Shantharam
[1] Albert Cohen,et al. Polyhedral Code Generation in the Real World, 2006, CC.
[2] Geri Georg,et al. Set and Relation Manipulation for the Sparse Polyhedral Framework , 2012, LCPC.
[3] Jacqueline Chame,et al. A script-based autotuning compiler system to generate high-performance CUDA code , 2013, TACO.
[4] Ken Kennedy,et al. Improving Memory Hierarchy Performance for Irregular Applications Using Data and Computation Reorderings , 2001, International Journal of Parallel Programming.
[5] Ken Kennedy,et al. Optimizing Compilers for Modern Architectures: A Dependence-based Approach, 2001.
[6] Larry Carter,et al. An approach for code generation in the Sparse Polyhedral Framework , 2016, Parallel Comput.
[7] Harry A. G. Wijshoff,et al. Sublimation: Expanding Data Structures to Enable Data Instance Specific Optimizations , 2010, LCPC.
[8] Joel H. Saltz,et al. Programming Irregular Applications: Runtime Support, Compilation and Tools , 1997, Adv. Comput.
[9] Rudolf Eigenmann,et al. The range test: a dependence test for symbolic, non-linear expressions , 1994, Proceedings of Supercomputing '94.
[10] Paul Feautrier,et al. Fuzzy Array Dataflow Analysis , 1997, J. Parallel Distributed Comput.
[11] Rudolf Eigenmann,et al. Optimizing irregular shared-memory applications for distributed-memory systems , 2006, PPoPP '06.
[12] Timothy A. Davis,et al. The University of Florida sparse matrix collection , 2011, TOMS.
[13] Corinne Ancourt,et al. Scanning polyhedra with DO loops , 1991, PPOPP '91.
[14] Monica S. Lam,et al. Interprocedural parallelization analysis in SUIF , 2005, TOPL.
[15] Samuel Williams,et al. Optimization of sparse matrix-vector multiplication on emerging multicore platforms , 2009, Parallel Comput.
[16] Chun Chen,et al. Polyhedra scanning revisited , 2012, PLDI.
[17] Albert Cohen,et al. The Polyhedral Model Is More Widely Applicable Than You Think , 2010, CC.
[18] Paul Feautrier,et al. Automatic Parallelization in the Polytope Model , 1996, The Data Parallel Programming Model.
[19] Francky Catthoor,et al. Polyhedral parallel code generation for CUDA , 2013, TACO.
[20] Joel H. Saltz,et al. Principles of runtime support for parallel processors , 1988, ICS '88.
[21] Ken Kennedy,et al. Improving cache performance in dynamic applications through data and computation reorganization at run time , 1999, PLDI '99.
[22] William Pugh,et al. Constraint-based array dependence analysis , 1998, TOPL.
[23] William Pugh,et al. Optimization within a unified transformation framework , 1996.
[24] Chau-Wen Tseng,et al. Exploiting locality for irregular scientific codes , 2006, IEEE Transactions on Parallel and Distributed Systems.
[25] Larry Carter,et al. Compile-time composition of run-time data and iteration reorderings , 2003, PLDI '03.
[26] Sanjay V. Rajopadhye,et al. Generation of Efficient Nested Loops from Polyhedra , 2000, International Journal of Parallel Programming.
[27] David A. Padua,et al. Compiler analysis of irregular memory accesses , 2000, PLDI '00.
[28] William Pugh,et al. Nonlinear array dependence analysis , 1994.
[29] Larry Carter,et al. Localizing non-affine array references , 1999, 1999 International Conference on Parallel Architectures and Compilation Techniques.
[30] Bo Wu,et al. Complexity analysis and algorithm design for reorganizing data to minimize non-coalesced memory accesses on GPU , 2013, PPoPP '13.
[31] Michael Garland,et al. Implementing sparse matrix-vector multiplication on throughput-oriented processors , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[32] L. Rauchwerger,et al. The LRPD Test: Speculative Run-Time Parallelization of Loops with Privatization and Reduction Parallelization , 1999, IEEE Trans. Parallel Distributed Syst.
[33] J. Ramanujam,et al. Code generation for parallel execution of a class of irregular loops on distributed memory systems , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[34] Lawrence Rauchwerger,et al. Hybrid Analysis: Static & Dynamic Memory Reference Analysis , 2004, International Journal of Parallel Programming.
[35] Rudolf Eigenmann,et al. Idiom recognition in the Polaris parallelizing compiler , 1995, ICS '95.
[36] Katherine Yelick,et al. OSKI: A library of automatically tuned sparse matrix kernels , 2005.
[37] Sven Verdoolaege,et al. isl: An Integer Set Library for the Polyhedral Model , 2010, ICMS.