A Low-Power FPGA Based on Autonomous Fine-Grain Power Gating

This paper presents a field-programmable gate array (FPGA) based on lookup-table-level fine-grain power gating with small overheads. The power gating technique implemented in the proposed architecture directly detects the activity of each lookup table by exploiting features of asynchronous architectures. Moreover, detecting data arrival in advance avoids both the added wake-up delay and the power consumed by unnecessary power switching. Because the technique has small overheads, a power-gated domain can be as fine as a single two-input, one-output lookup table. The proposed FPGA is fabricated using the ASPLA 90-nm CMOS process with dual threshold voltages. We use an image processing application called “template matching” for evaluation. Since the proposed FPGA is suited to processing in which the workload changes dynamically, an adaptive algorithm with a small computational kernel is employed. Compared to a synchronous FPGA and an asynchronous FPGA without power gating, power consumption is reduced by 38% and 15%, respectively, at 85°C.
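
To illustrate what a dynamically changing workload can look like in template matching, the following is a minimal software sketch, not the authors' algorithm or hardware mapping. It assumes a sum-of-absolute-differences (SAD) score and an early-termination rule; the abstract specifies neither, and the names `sad` and `match_template` are hypothetical. Because a candidate position is abandoned as soon as its partial score exceeds the best score found so far, the amount of computation per position depends on the data.

```python
def sad(image, template, top, left, limit=None):
    """Sum of absolute differences at (top, left).

    Assumption: returns None as soon as the running total exceeds `limit`,
    so poor candidates cost less work than good ones.
    """
    total = 0
    for r, row in enumerate(template):
        for c, t in enumerate(row):
            total += abs(image[top + r][left + c] - t)
        if limit is not None and total > limit:
            return None  # abandoned early: this position cannot beat the best
    return total


def match_template(image, template):
    """Return ((top, left), score) of the best-matching position."""
    th, tw = len(template), len(template[0])
    best_score, best_pos = None, None
    for top in range(len(image) - th + 1):
        for left in range(len(image[0]) - tw + 1):
            score = sad(image, template, top, left, best_score)
            if score is not None and (best_score is None or score < best_score):
                best_score, best_pos = score, (top, left)
    return best_pos, best_score


image = [
    [0, 0, 0, 0],
    [0, 5, 6, 0],
    [0, 7, 8, 0],
    [0, 0, 0, 0],
]
template = [[5, 6], [7, 8]]
print(match_template(image, template))
```

In this sketch, positions evaluated after a good match is found tend to terminate after a single row, which is the kind of idle period that fine-grain power gating is intended to exploit.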
