Flexible VLIW processor based on FPGA for efficient embedded real-time image processing

Modern field-programmable gate array (FPGA) chips, with their larger memory capacity and reconfigurability, are opening new frontiers in the rapid prototyping of embedded systems. With the advent of high-density FPGAs, it is now possible to implement a high-performance VLIW (very long instruction word) processor core in an FPGA. In a VLIW architecture, processor effectiveness depends on the ability of the compiler to extract sufficient ILP (instruction-level parallelism) from program code. This paper describes research results on enabling the VLIW processor model for real-time processing applications by exploiting FPGA technology. Our goals are to keep the flexibility of processors in order to shorten the development cycle, and to use the powerful FPGA resources to increase real-time performance. We present a flexible VLIW processor model, written in VHDL, with a variable instruction set and a customizable architecture; it allows the intrinsic parallelism of a target application to be exploited using advanced compiler technology and implemented in an optimal manner on an FPGA. Several common image processing algorithms were tested and validated using the proposed development cycle. We also carried out the rapid prototyping of embedded contactless palmprint extraction for a biometric application on a Virtex-6 FPGA-based board and obtained a processing time of 145.6 ms per image. Our approach meets several criteria for co-design tools: flexibility, modularity, performance, and reusability.
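
As a rough illustration of why VLIW effectiveness depends on the ILP the compiler can expose, the sketch below greedily packs operations into instruction words (bundles). It is not the scheduler used in this work: the function name `pack_bundles`, the unit-latency assumption, and the assumption that every issue slot can execute any operation are all hypothetical simplifications.

```python
from typing import Dict, List, Set


def pack_bundles(deps: Dict[str, Set[str]], issue_width: int) -> List[List[str]]:
    """Greedily pack operations into VLIW bundles (illustrative only).

    deps maps each operation name to the set of operations whose results
    it needs. Assumptions: every operation has unit latency, and any
    issue slot can execute any operation.
    """
    if issue_width < 1:
        raise ValueError("issue_width must be at least 1")

    done: Set[str] = set()        # operations issued in earlier bundles
    remaining = list(deps)        # keeps the original program order
    bundles: List[List[str]] = []

    while remaining:
        # An operation is ready once all of its inputs were produced
        # by earlier bundles.
        ready = [op for op in remaining if deps[op] <= done]
        if not ready:
            raise ValueError("cyclic or unsatisfiable dependencies")
        bundle = ready[:issue_width]   # fill at most issue_width slots
        bundles.append(bundle)
        done.update(bundle)
        remaining = [op for op in remaining if op not in bundle]

    return bundles


# Hypothetical five-operation dependency graph: c needs a and b, e needs c and d.
example = {"a": set(), "b": set(), "c": {"a", "b"}, "d": set(), "e": {"c", "d"}}

narrow = pack_bundles(example, issue_width=1)
wide = pack_bundles(example, issue_width=2)
wider = pack_bundles(example, issue_width=4)
print(len(narrow), len(wide), len(wider))
```

In this toy graph, going from one issue slot to two shortens the schedule, but going from two to four does not, because the dependency chain a/b, c, e limits the available parallelism. Wider instruction words only pay off when the compiler finds enough independent operations to fill them.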