Auto-vectorization for image processing DSLs
Jürgen Teich | Frank Hannig | Oliver Reiche | Christof Kobylko
[1] Jürgen Teich, et al. ExaSlang: A Domain-Specific Language for Highly Scalable Multigrid Solvers, 2014, 2014 Fourth International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing.
[2] Fridtjof Stein, et al. Efficient Computation of Optical Flow Using the Census Transform, 2004, DAGM-Symposium.
[3] Hao Zhou, et al. Loop-oriented array- and field-sensitive pointer analysis for automatic SIMD vectorization, 2016, LCTES.
[4] Xinmin Tian, et al. Reducing the Functionality Gap Between Auto-Vectorization and Explicit Vectorization - Compress/Expand and Histogram, 2016, IWOMP.
[5] Sebastian Hack, et al. Sierra: a SIMD extension for C++, 2014, WPMVP '14.
[6] Yosi Ben-Asher, et al. Hybrid type legalization for a sparse SIMD instruction set, 2013, ACM Trans. Archit. Code Optim.
[7] Jack J. Dongarra, et al. A comparative study of automatic vectorizing compilers, 1991, Parallel Comput.
[8] H. Jensen. Night Rendering, 2000.
[9] Richard Henderson, et al. Multi-platform auto-vectorization, 2006, International Symposium on Code Generation and Optimization (CGO'06).
[10] Jaewook Shin, et al. Superword-level parallelism in the presence of control flow, 2005, International Symposium on Code Generation and Optimization.
[11] Nalini Vasudevan, et al. FlexVec: auto-vectorization for irregular loops, 2016, PLDI.
[12] Sebastian Hack, et al. Whole-function vectorization, 2011, International Symposium on Code Generation and Optimization (CGO 2011).
[13] Jürgen Teich, et al. HIPAcc: A Domain-Specific Language and Compiler for Image Processing, 2016, IEEE Transactions on Parallel and Distributed Systems.
[14] R. Govindarajan, et al. A Vectorizing Compiler for Multimedia Extensions, 2000, International Journal of Parallel Programming.
[15] Ken Kennedy, et al. Conversion of control dependence to data dependence, 1983, POPL '83.
[16] Ayal Zaks, et al. Outer-loop vectorization - revisited for short SIMD architectures, 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[17] David Padua, et al. Encyclopedia of Parallel Computing, 2011.
[18] Frédo Durand, et al. Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines, 2013, PLDI 2013.
[19] Christopher G. Harris, et al. A Combined Corner and Edge Detector, 1988, Alvey Vision Conference.
[20] Andreas Krall, et al. Compilation Techniques for Multimedia Processors, 2004, International Journal of Parallel Programming.
[21] Sebastian Hack, et al. Improving Performance of OpenCL on CPUs, 2012, CC.
[22] Saman P. Amarasinghe, et al. Exploiting superword level parallelism with multimedia instruction sets, 2000, PLDI '00.
[23] Mark J. Shensa, et al. The discrete wavelet transform: wedding the a trous and Mallat algorithms, 1992, IEEE Trans. Signal Process.
[24] M. Pharr, et al. ispc: A SPMD compiler for high-performance CPU programming, 2012, 2012 Innovative Parallel Computing (InPar).