Fusing convolution kernels through tiling
[1] Christopher G. Harris, et al. A Combined Corner and Edge Detector, 1988, Alvey Vision Conference.
[2] P. Sadayappan, et al. High-performance code generation for stencil computations on GPU architectures, 2012, ICS '12.
[3] Albert Cohen, et al. Split tiling for GPUs: automatic parallelization using trapezoidal tiles, 2013, GPGPU@ASPLOS.
[4] Uday Bondhugula, et al. PLuTo: A Practical and Fully Automatic Polyhedral Program Optimization System, 2015.
[5] John F. Canny, et al. A Computational Approach to Edge Detection, 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[6] Pierre G. Paulin, et al. A novel compilation approach for image processing graphs on a many-core platform with explicitly managed memory, 2013, 2013 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES).
[7] Vinod Grover, et al. Forma: a DSL for image processing applications to target GPUs and multi-core CPUs, 2015, GPGPU@PPoPP.
[8] Frédo Durand, et al. Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines, 2013, PLDI 2013.
[9] Gordon L. Kindlmann, et al. Diderot: a parallel DSL for image analysis and visualization, 2012, PLDI.
[10] Uday Bondhugula, et al. PolyMage: Automatic Optimization for Image Processing Pipelines, 2015, ASPLOS.
[11] Richard Veras, et al. A stencil compiler for short-vector SIMD architectures, 2013, ICS '13.
[12] Jason Cong, et al. Polyhedral-based data reuse optimization for configurable computing, 2013, FPGA '13.
[13] Uday Bondhugula, et al. Effective automatic parallelization of stencil computations, 2007, PLDI '07.