FPGA-based efficient on-chip memory for image processing algorithms

In a Field Programmable Gate Array (FPGA), efficient use of on-chip Static Random Access Memory (SRAM) is extremely important for most applications, and especially for image processing. True Dual Port (TDP) SRAM and Single Port (SP) SRAM are the SRAM types typically available for image processing algorithms. When the data access policy changes, however, these memories must be redesigned. An on-chip memory architecture that can scan the data in different ways without redesign is therefore required. In the proposed sub-bank Dual Port (DP) memory architecture, SP SRAM is modified to function as a TDP SRAM, with high throughput and lower power consumption. With the help of a two-port memory control unit and clock and address generators, the architecture also provides a higher level of abstraction suited to image processing algorithms. The proposed sub-bank memory architecture and its surrounding system are implemented and verified for the Lapped Biorthogonal Transform based Low complexity Zerotree Codec (LBT-LZC), an image coding algorithm. With respect to resource utilization, timing, and power, the proposed system outperforms TDP SRAMs.
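
The general idea of presenting single-port banks as a two-port memory, with an address generator deciding the scan order, can be illustrated with a small behavioral model. The sketch below is not the architecture described in the paper: the class `SubBankMemory`, the low-order address interleaving, the two-cycle penalty on a bank conflict, and the helper names are all assumptions chosen only for illustration.

```python
# Behavioral sketch (illustrative assumptions throughout, not the paper's design):
# several single-port banks are exposed as one two-port memory. Two requests
# complete in one cycle when they hit different banks; a same-bank pair is
# assumed to be serialized and costs two cycles.

class SubBankMemory:
    def __init__(self, depth, num_banks=2):
        self.num_banks = num_banks
        rows = (depth + num_banks - 1) // num_banks
        self.banks = [[0] * rows for _ in range(num_banks)]
        self.cycles = 0  # simple cycle counter for comparing scan orders

    def _locate(self, addr):
        # Assumed mapping: low-order interleaving (bank = addr mod num_banks).
        return addr % self.num_banks, addr // self.num_banks

    def _do(self, request):
        if request is None:
            return None
        addr, data = request
        bank, row = self._locate(addr)
        if data is None:              # read request
            return self.banks[bank][row]
        self.banks[bank][row] = data  # write request
        return None

    def access(self, port_a, port_b=None):
        """Each port is None or (addr, data); data=None means read."""
        conflict = (
            port_a is not None
            and port_b is not None
            and self._locate(port_a[0])[0] == self._locate(port_b[0])[0]
        )
        self.cycles += 2 if conflict else 1
        return self._do(port_a), self._do(port_b)


def raster_addresses(width, height):
    """Row-by-row scan order over a width x height image."""
    for y in range(height):
        for x in range(width):
            yield y * width + x


def column_addresses(width, height):
    """Column-by-column scan order over the same image."""
    for x in range(width):
        for y in range(height):
            yield y * width + x


def read_in_pairs(mem, addresses):
    """Read addresses two at a time, one on each port."""
    addrs = list(addresses)
    out = []
    for i in range(0, len(addrs), 2):
        a = (addrs[i], None)
        b = (addrs[i + 1], None) if i + 1 < len(addrs) else None
        ra, rb = mem.access(a, b)
        out.append(ra)
        if b is not None:
            out.append(rb)
    return out
```

Under these assumptions, reading a 4x4 image in raster order sends each pair of addresses to different banks, while a column scan of the same image sends every pair to the same bank, so the same memory takes twice as many cycles. This is the kind of dependence on access policy that motivates swapping the address generator rather than redesigning the memory.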
