An evaluation of current SIMD programming models for C++
[1] Vivek Sarkar, et al. Efficient Selection of Vector Instructions Using Dynamic Programming, 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[2] Michael D. McCool, et al. Intel's Array Building Blocks: A retargetable, dynamic compiler and embedded language, 2011, International Symposium on Code Generation and Optimization (CGO 2011).
[3] Noah Treuhaft, et al. Scalable Processors in the Billion-Transistor Era: IRAM, 1997, Computer.
[4] José E. Moreira, et al. Simple, portable and fast SIMD intrinsic programming: generic simd library, 2014, WPMVP '14.
[5] Saman P. Amarasinghe, et al. Exploiting superword level parallelism with multimedia instruction sets, 2000, PLDI '00.
[6] Peng Wu, et al. Vectorization for SIMD architectures with alignment constraints, 2004, PLDI '04.
[7] Brigitte Rozoy, et al. Boost.SIMD: generic programming for portable SIMDization, 2012, PACT '12.
[8] Magnus Jahre, et al. Optimized hardware for suboptimal software: The case for SIMD-aware benchmarks, 2014, 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[9] David A. Padua, et al. An Evaluation of Vectorizing Compilers, 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.
[10] M. Pharr, et al. ispc: A SPMD compiler for high-performance CPU programming, 2012, 2012 Innovative Parallel Computing (InPar).
[11] Ayal Zaks, et al. Outer-loop vectorization - revisited for short SIMD architectures, 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[12] Gang Ren, et al. Optimizing data permutations for SIMD devices, 2006, PLDI '06.
[13] Volker Lindenstruth, et al. Vc: A C++ library for explicit vectorization, 2012, Softw. Pract. Exp.
[14] Timothy M. Jones, et al. Throttling Automatic Vectorization: When Less is More, 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[15] Ben H. H. Juurlink, et al. SIMD Acceleration for HEVC Decoding, 2015, IEEE Transactions on Circuits and Systems for Video Technology.
[16] Timothée Ewart, et al. Cyme: A Library Maximizing SIMD Computation on User-Defined Containers, 2014, ISC.
[17] Mahmut T. Kandemir, et al. A compiler framework for extracting superword level parallelism, 2012, PLDI '12.
[18] Alan Jay Smith, et al. Multimedia Instruction Sets for General Purpose Microprocessors: a, 2000.
[19] Ingo Wald, et al. Extending a C-like language for portable SIMD programming, 2012, PPoPP '12.
[20] Jaewook Shin, et al. Superword-level parallelism in the presence of control flow, 2005, International Symposium on Code Generation and Optimization.
[21] Peng Wu, et al. Efficient SIMD code generation for runtime alignment and length conversion, 2005, International Symposium on Code Generation and Optimization.
[22] Arch D. Robison, et al. Composable Parallel Patterns with Intel Cilk Plus, 2013, Computing in Science & Engineering.