Efficient Computational Scheduling of Box and Gaussian FIR Filtering for CPU Microarchitecture

In this paper, we propose an efficient computational scheduling of box and Gaussian filtering. These filters are fundamental tools used in a wide range of applications. The computational order of a naïve implementation of these FIR filters is $O(r^{2})$ per pixel, where $r$ is the kernel radius. A separable implementation reduces the order to $O(r)$ but requires two filtering passes. A recursive representation reduces the order further to $O(1)$ but also needs two or more filtering passes. These efficient representations cut the number of arithmetic operations; however, the cost of data I/O then dominates the computational time. We therefore optimize the computational scheduling of $O(1)$ box and Gaussian filters to use cache memory effectively and reduce the time spent on data I/O. Experimental results show that the proposed scheduling achieves higher computational performance than the conventional implementation.
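To make the complexity claims concrete, the following is a minimal one-dimensional sketch of a box filter in its direct FIR form and in its recursive (running-sum) form. It is an illustration only and not the scheduling proposed in the paper: the function names, the replicated-border handling, and the use of plain Python lists are all assumptions made for this example. Applying the 1-D filter along rows and then along columns gives the separable 2-D filter, which is where the two passes come from.

```python
def box_filter_naive(signal, r):
    """Direct FIR box filter: 2r+1 additions per output sample."""
    n = len(signal)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(-r, r + 1):
            j = min(max(i + k, 0), n - 1)  # assumption: replicate border samples
            total += signal[j]
        out.append(total / (2 * r + 1))
    return out


def box_filter_recursive(signal, r):
    """Running-sum box filter: one add and one subtract per sample, independent of r."""
    n = len(signal)
    if n == 0:
        return []
    width = 2 * r + 1
    # Build the window sum for the first output sample once (O(r) setup cost).
    total = 0.0
    for k in range(-r, r + 1):
        total += signal[min(max(k, 0), n - 1)]
    out = [total / width]
    for i in range(1, n):
        entering = signal[min(i + r, n - 1)]    # sample that slides into the window
        leaving = signal[max(i - r - 1, 0)]     # sample that slides out of the window
        total += entering - leaving
        out.append(total / width)
    return out


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(box_filter_naive(data, 1))
    print(box_filter_recursive(data, 1))
```

Both functions produce the same output up to floating-point rounding, but the recursive form touches only two input samples per output regardless of the radius. With so little arithmetic left, memory traffic becomes the limiting factor, which is the motivation for the cache-aware scheduling studied in the paper.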
