Efficient SIMD code generation for irregular kernels
[1] Hongbin Zheng, et al. Polly – Polyhedral optimization in LLVM, 2012.
[2] Wonyong Sung, et al. Efficient vectorization of SIMD programs with non-aligned and irregular data access hardware, 2008, CASES '08.
[3] Vivek Sarkar, et al. Efficient Selection of Vector Instructions Using Dynamic Programming, 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[4] Joel H. Saltz, et al. Communication Optimizations for Irregular Scientific Computations on Distributed Memory Architectures, 1994, J. Parallel Distributed Comput.
[5] Andreas Krall, et al. Compilation Techniques for Multimedia Processors, 2004, International Journal of Parallel Programming.
[6] Vikram S. Adve, et al. Macroscopic Data Structure Analysis and Optimization, 2005.
[7] Rainer Leupers. Code selection for media processors with SIMD instructions, 2000, DATE '00.
[8] Andreas Krall, et al. Pointer Alignment Analysis for Processors with SIMD Instructions, 2003.
[9] Peng Zhao, et al. An integrated simdization framework using virtual vectors, 2005, ICS '05.
[10] Ken Kennedy, et al. Optimizing Compilers for Modern Architectures: A Dependence-based Approach, 2001.
[11] Gang Ren, et al. Optimizing data permutations for SIMD devices, 2006, PLDI '06.
[12] Rodric M. Rabbah, et al. Exploiting vector parallelism in software pipelined loops, 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[13] Martin Hopkins, et al. Synergistic Processing in Cell's Multicore Architecture, 2006, IEEE Micro.
[14] Gang Ren, et al. A Preliminary Study on the Vectorization of Multimedia Applications for Multimedia Extensions, 2003, LCPC.
[15] John Shalf, et al. Exascale Computing Technology Challenges, 2010, VECPAR.
[16] Robert E. Tarjan, et al. Depth-First Search and Linear Graph Algorithms, 1972, SIAM J. Comput.
[17] R. Govindarajan, et al. A Vectorizing Compiler for Multimedia Extensions, 2000, International Journal of Parallel Programming.
[18] Saman P. Amarasinghe, et al. Exploiting superword level parallelism with multimedia instruction sets, 2000, PLDI '00.
[19] Peng Wu, et al. Vectorization for SIMD architectures with alignment constraints, 2004, PLDI '04.
[20] Ayal Zaks, et al. Auto-vectorization of interleaved data for SIMD, 2006, PLDI '06.
[21] Vikram S. Adve, et al. LLVM: a compilation framework for lifelong program analysis & transformation, 2004, International Symposium on Code Generation and Optimization, 2004. CGO 2004.
[22] Hunter Scales, et al. AltiVec Extension to PowerPC Accelerates Media Processing, 2000, IEEE Micro.
[23] Ayal Zaks, et al. Vectorizing for a SIMdD DSP architecture, 2003, CASES '03.
[24] John L. Henning. SPEC CPU2006 benchmark descriptions, 2006, CARN.