Accelerating a modified Gaussian pyramid with a customized processor

Image pyramids are multi-scale representations of images, and computing them is expensive. They can become a major bottleneck in image processing and computer vision tasks such as edge detection and feature extraction, so high-speed computation of image pyramids is necessary. Moreover, when these algorithms target embedded systems, further constraints such as area and energy consumption must be satisfied. This paper presents a customized processor design that accelerates the execution of a stack of Gaussian low-pass filters. Using an instruction extension language, we added custom instructions to a 32-bit RISC-based configurable processor. We use three techniques to improve performance: operator fusion, single-instruction multiple-data (SIMD) vectorization, and data reuse. The proposed processor achieves a 12.3× speedup over the base processor, with 19% hardware overhead. The estimated improvement in energy consumption is 10.3×. The paper also presents implementation results for the computation of a modified Gaussian pyramid in a tone mapping algorithm.
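To make the terms "stack of Gaussian low-pass filters" and "data reuse" concrete, the following is a minimal plain-C sketch, not the custom instructions described in the paper. It assumes a 5-tap binomial kernel [1 4 6 4 1]/16 as the Gaussian approximation, 8-bit pixels, and border replication; the function names `blur_row` and `gaussian_stack_row` are hypothetical. Each pixel is loaded once into a sliding window of local variables and reused for five consecutive outputs, which is the kind of reuse a custom register file could exploit in hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an index into [0, n-1] so the border pixel is replicated. */
static size_t clamp_index(ptrdiff_t i, size_t n) {
    if (i < 0) return 0;
    if ((size_t)i >= n) return n - 1;
    return (size_t)i;
}

/* One low-pass pass over a row with the assumed kernel [1 4 6 4 1]/16.
 * The window w0..w4 slides by one pixel per output, so every source
 * pixel is read from memory only once (data reuse). */
void blur_row(const uint8_t *src, uint8_t *dst, size_t n) {
    if (n == 0) return;
    unsigned w0 = src[0], w1 = src[0], w2 = src[0];
    unsigned w3 = src[clamp_index(1, n)];
    unsigned w4 = src[clamp_index(2, n)];
    for (size_t x = 0; x < n; ++x) {
        /* Weighted sum with rounding; weights add up to 16. */
        dst[x] = (uint8_t)((w0 + 4 * w1 + 6 * w2 + 4 * w3 + w4 + 8) >> 4);
        /* Shift the window and load exactly one new pixel. */
        w0 = w1; w1 = w2; w2 = w3; w3 = w4;
        w4 = src[clamp_index((ptrdiff_t)x + 3, n)];
    }
}

/* Build a stack of num_levels progressively smoother rows.
 * Level k is stored at levels + k * n and is the blur of level k-1
 * (level 0 is the blur of src). No downsampling is performed. */
void gaussian_stack_row(const uint8_t *src, uint8_t *levels,
                        size_t n, size_t num_levels) {
    const uint8_t *prev = src;
    for (size_t k = 0; k < num_levels; ++k) {
        uint8_t *cur = levels + k * n;
        blur_row(prev, cur, n);
        prev = cur;
    }
}
```

A caller would pass one image row and a buffer of `n * num_levels` bytes. A full 2-D filter would apply the same pass along columns, and a SIMD version would compute several adjacent outputs per instruction.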
