A compiler framework for extracting superword level parallelism
[1] Rainer Leupers,et al. A SIMD optimization framework for retargetable compilers , 2009, TACO.
[2] Ayal Zaks,et al. Auto-vectorization of interleaved data for SIMD , 2006, PLDI '06.
[3] Francky Catthoor,et al. Pack Transposition: Enhancing Superword Level Parallelism Exploitation , 2005, PARCO.
[4] Saman P. Amarasinghe,et al. Exploiting superword level parallelism with multimedia instruction sets , 2000, PLDI '00.
[5] Peng Wu,et al. Vectorization for SIMD architectures with alignment constraints , 2004, PLDI '04.
[6] Jaewook Shin,et al. Compiler-controlled caching in superword register files for multimedia extension architectures , 2002, Proceedings. International Conference on Parallel Architectures and Compilation Techniques.
[7] Rainer Leupers,et al. A uniform optimization technique for offset assignment problems , 1998, Proceedings. 11th International Symposium on System Synthesis (Cat. No.98EX210).
[8] Mary Hall,et al. Compiler optimizations for architectures supporting superword-level parallelism , 2005 .
[9] Gang Ren,et al. Optimizing data permutations for SIMD devices , 2006, PLDI '06.
[10] Vivek Sarkar,et al. Efficient Selection of Vector Instructions Using Dynamic Programming , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[11] Ken Kennedy,et al. Relaxing SIMD control flow constraints using loop transformations , 1992, PLDI '92.
[12] Andreas Krall,et al. Compilation Techniques for Multimedia Processors , 2004, International Journal of Parallel Programming.
[13] Derek J. DeVries. A vectorizing SUIF compiler, implementation and performance , 1997 .
[14] Jaewook Shin,et al. Exploiting Superword-Level Locality in Multimedia Extension Architectures , 2003, J. Instr. Level Parallelism.
[15] Ayal Zaks,et al. Outer-loop vectorization - revisited for short SIMD architectures , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[16] Michael Kagan,et al. The Pentium® Processor with MMX™ Technology , 1997 .
[17] R. Govindarajan,et al. A Vectorizing Compiler for Multimedia Extensions , 2000, International Journal of Parallel Programming.
[18] Corinna G. Lee,et al. Initial results on the performance and cost of vector microprocessors , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[19] Michael Weiss. Strip mining on SIMD architectures , 1991, ICS '91.
[20] Steven W. K. Tjiang,et al. SUIF: an infrastructure for research on parallelizing and optimizing compilers , 1994, SIGP.
[21] Aart J. C. Bik,et al. Automatic Intra-Register Vectorization for the Intel® Architecture , 2002, International Journal of Parallel Programming.
[22] Francisco Tirado,et al. Improving superword level parallelism support in modern compilers , 2005, 2005 Third IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS'05).
[23] Corinna G. Lee,et al. Simple vector microprocessors for multimedia applications , 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.
[24] Fred Weber,et al. AMD 3DNow! technology: architecture and implementations , 1999, IEEE Micro.
[25] Franz Franchetti,et al. Data Layout Transformation for Stencil Computations on Short-Vector SIMD Architectures , 2011, CC.
[26] Samuel Larsen,et al. Compilation techniques for short-vector instructions , 2006 .
[27] Uday Bondhugula,et al. A practical automatic polyhedral parallelizer and locality optimizer , 2008, PLDI '08.
[28] Peng Wu,et al. Efficient SIMD code generation for runtime alignment and length conversion , 2005, International Symposium on Code Generation and Optimization.
[29] Jaewook Shin,et al. Superword-level parallelism in the presence of control flow , 2005, International Symposium on Code Generation and Optimization.