Design of a pseudo-log image transform hardware accelerator in a high-level synthesis-based memory management framework

Abstract. The pseudo-log image transform belongs to a class of image processing kernels whose memory references are nonlinear functions of the loop indices. Because of this nonlinearity, conventional design methodologies do not yield efficient hardware implementations of such kernels. An optimized hardware implementation requires a customized memory hierarchy and an efficient data and memory management strategy. We present the design and real-time hardware implementation of a pseudo-log image transform IP (a hardware image processing engine) using a memory management framework. The framework generates a controller that manages the movement of input data, in the form of tiles, between off-chip main memory, on-chip memory, and the core processing unit. It jointly optimizes the memory hierarchy and the tile computation schedule to reduce on-chip memory requirements, maximize throughput, and increase data reuse, which in turn lowers the off-chip memory bandwidth requirement. The algorithmic C++ description of the pseudo-log kernel is profiled in the framework to generate an enhanced description with a customized memory hierarchy. This enhanced description is then passed to high-level synthesis (HLS) for architectural design space exploration, in order to find an optimal implementation under given performance constraints. The optimized register transfer level (RTL) implementation of the IP produced by HLS is used for performance estimation, which is carried out in a simulation framework that characterizes the IP under different external off-chip memory latencies and a variety of data transfer policies. Experimental results show that the designed IP is suitable for real-time operation and that the generated memory hierarchy can feed the IP with sufficiently high bandwidth even in the presence of long external memory latencies.
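
To make the notion of nonlinear memory references concrete, the following C++ fragment is a minimal sketch of such a kernel. It is not the pseudo-log kernel of this work, whose exact mapping is not given in the abstract; a plain log-polar resampling is used as a stand-in, and the names `Image`, `source_coord`, `log_polar_transform`, and `tiles_touched_by_ring`, as well as the square tile shape, are assumptions made for illustration only. The input address read for each output pixel depends on `pow`, `cos`, and `sin` of the loop indices, so one output row may pull data from many scattered input tiles.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical row-major grayscale image (illustration only).
struct Image {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
    std::uint8_t at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Maps an output sample (ring, sector) to an input pixel (x, y).
// Assumed stand-in mapping: radius grows exponentially with the ring index,
// from 1 pixel up to r_max, so the result is a nonlinear function of both indices.
inline std::pair<int, int> source_coord(int ring, int sector, int rings,
                                        int sectors, int width, int height) {
    assert(rings >= 2 && sectors >= 1 && width >= 3 && height >= 3);
    const double pi = 3.14159265358979323846;
    const double cx = (width - 1) / 2.0;
    const double cy = (height - 1) / 2.0;
    const double r_max = std::min(cx, cy);
    const double r = std::pow(r_max, static_cast<double>(ring) / (rings - 1));
    const double theta = 2.0 * pi * sector / sectors;
    const int x = static_cast<int>(std::lround(cx + r * std::cos(theta)));
    const int y = static_cast<int>(std::lround(cy + r * std::sin(theta)));
    return {x, y};
}

// Nearest-neighbour resampling: one nonlinear input read per output pixel.
std::vector<std::uint8_t> log_polar_transform(const Image& in, int rings,
                                              int sectors) {
    std::vector<std::uint8_t> out(static_cast<std::size_t>(rings) * sectors);
    for (int ring = 0; ring < rings; ++ring) {
        for (int sector = 0; sector < sectors; ++sector) {
            const std::pair<int, int> p =
                source_coord(ring, sector, rings, sectors, in.width, in.height);
            out[static_cast<std::size_t>(ring) * sectors + sector] =
                in.at(p.first, p.second);
        }
    }
    return out;
}

// Counts the distinct square input tiles (tile x tile pixels) that one
// output ring reads, as a rough indicator of how scattered the accesses are.
std::size_t tiles_touched_by_ring(int ring, int rings, int sectors, int width,
                                  int height, int tile) {
    std::set<std::pair<int, int>> tiles;
    for (int sector = 0; sector < sectors; ++sector) {
        const std::pair<int, int> p =
            source_coord(ring, sector, rings, sectors, width, height);
        tiles.insert({p.first / tile, p.second / tile});
    }
    return tiles.size();
}
```

In this sketch the innermost ring reads from a few tiles around the image centre, while the outermost ring reads from tiles spread around the image border. Access patterns of this kind are what motivate a tile-based on-chip memory hierarchy with a computation schedule chosen to favour data reuse.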
