A dynamically reconfigurable asynchronous processor for low power applications

High-throughput mobile applications increasingly demand both programmability and energy efficiency. Conventional mobile Central Processing Units (CPUs) and Very Long Instruction Word (VLIW) processors cannot meet these demands. In this paper, we present a novel dynamically reconfigurable processor that targets these requirements. The processor consists of a heterogeneous array of coarse-grained asynchronous cells. The architecture retains most of the benefits of custom asynchronous design while remaining programmable in conventional high-level languages. Compared with an equivalent synchronous design, our processor reduces power consumption by up to 18%. It also consumes considerably less power than a market-leading VLIW processor and a low-power ARM processor while matching their throughput. When running the bilinear demosaicing algorithm at the same throughput, our processor consumes around 9.5 times less power than the ARM7 processor.
