An efficient and compact row buffer architecture on FPGA for real-time neighbourhood image processing

This work presents a compact and efficient row buffer (RB) architecture on field-programmable gate array (FPGA). The design packs multiple RBs into the full capacity of a single Xilinx Block RAM (BRAM), in contrast to the conventional approach, which dedicates a whole BRAM to each RB and uses only part of its capacity. The BRAM is configured with different port aspect ratios and is addressed through an efficient pattern generator circuit, so that the design can buffer image data pixel by pixel and retrieve multiple pixels per clock in a predefined pattern, thereby providing the functionality of multiple RBs. The design is built on the smallest BRAM primitive, BRAM18, so that it can be scaled in small steps to larger kernel and image sizes, giving the most economical solution. The proposed architecture keeps the bandwidth requirement at 1 pixel/clock with an ideal efficiency of 1 clock/pixel, saves up to 87.5% of the BRAMs compared with conventional RBs, and at the same time sustains high frame rates (1920×1080 @ 217 fps) to support real-time image processing. It is therefore feasible to replace conventional high-cost RBs with the proposed RBs on the latest FPGA devices, especially for high-performance yet area-constrained neighbourhood image processing applications.
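The buffering idea described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the authors' circuit: it assumes a memory with a narrow write port (one pixel per address) and a wide read port (n_rows pixels per word), and a hypothetical address pattern in which the pixel of row r, column x is written to address x * n_rows + (r mod n_rows). The function names and the address pattern are illustrative choices. The two helper functions only redo the arithmetic behind the quoted figures; the value of eight RBs per BRAM is an assumption that happens to reproduce the 87.5% figure.

```python
def packed_row_buffer(image, n_rows):
    """Behavioural model of n_rows row buffers sharing one memory.

    Assumed (hypothetical) layout: the pixel of row r, column x is stored
    at narrow-port address x * n_rows + (r % n_rows), so one wide read at
    column x returns that column's pixels from the previous n_rows rows.
    Yields (r, x, column), where column holds rows r-n_rows .. r, oldest
    first. Rows that do not exist yet read as 0.
    """
    width = len(image[0])
    mem = [0] * (width * n_rows)          # one pixel per narrow address
    for r, row in enumerate(image):
        slot = r % n_rows                 # slot holding the oldest row
        for x, pixel in enumerate(row):
            base = x * n_rows
            word = mem[base:base + n_rows]            # wide read
            history = word[slot:] + word[:slot]       # reorder, oldest first
            mem[base + slot] = pixel                  # narrow write
            yield r, x, history + [pixel]


def bram_saving(rbs_per_bram):
    """Fraction of BRAMs saved versus one BRAM per row buffer."""
    return 1 - 1 / rbs_per_bram


def required_pixel_clock_hz(width, height, fps):
    """Pixel clock needed at 1 pixel/clock, ignoring blanking intervals."""
    return width * height * fps


if __name__ == "__main__":
    img = [[10 * r + x for x in range(3)] for r in range(4)]
    for r, x, column in packed_row_buffer(img, n_rows=2):
        print(r, x, column)
    print(bram_saving(8))
    print(required_pixel_clock_hz(1920, 1080, 217))
```

In this model each input pixel costs exactly one read and one write, which corresponds to the 1 pixel/clock behaviour claimed in the abstract; the reordering step stands in for whatever multiplexing the real design uses. The clock helper suggests that 1920×1080 at 217 fps corresponds to a pixel clock of roughly 450 MHz if blanking is ignored.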
