PowerFITS: Reduce Dynamic and Static I-Cache Power Using Application Specific Instruction Set Synthesis

Power consumption, performance, area, and cost are critical concerns in designing microprocessors for embedded systems such as portable handheld computing and personal telecommunication devices. In previous work [A. Cheng et al., (2004)], we introduced framework-based instruction-set tuning synthesis (FITS), a new instruction synthesis paradigm that falls between a general-purpose embedded processor and a synthesized application-specific processor (ASP). FITS addresses these design constraints by improving code density: a FITS processor tailors the instruction set to the requirements of a target application to reduce its code size. This is achieved by replacing the fixed instruction and register decoding of a general-purpose embedded processor with programmable decoders, which can achieve ASP performance, low power consumption, and compact chip area while retaining the fabrication advantages of a mass-produced single-chip solution that amortizes cost. The instruction cache has been recognized as one of the predominant sources of power dissipation in a microprocessor. For instance, in Intel's StrongARM processor, 27% of total chip power is dissipated in the instruction cache [J. Montanaro et al., (1996)]. In this paper, we demonstrate how FITS can be applied to improve instruction cache power efficiency. Experimental results show that our synthesized instruction sets significantly reduce instruction cache power compared to ARM instructions. For 21 benchmarks from the MiBench suite [M. Guthaus et al., (2001)], our simulation results indicate, on average, a 49.4% saving in switching power, a 43.9% saving in internal power, a 14.9% saving in leakage power, and a 46.6% saving in total cache power, with up to a 60.3% saving in peak power.
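As a reading aid for the reported savings, the sketch below shows how per-component savings (switching, internal, leakage) combine into a saving in total cache power when each component is weighted by its baseline share. The function name `total_saving` and the baseline split are hypothetical and are not taken from the paper; only the three per-component percentages come from the abstract. Because the abstract reports averages over 21 benchmarks, this identity is not expected to reproduce the reported 46.6% exactly.

```python
def total_saving(baseline, savings):
    """Fractional saving in total power, given per-component baseline
    power and per-component fractional savings (hypothetical helper)."""
    total_before = sum(baseline.values())
    total_after = sum(p * (1.0 - savings[name]) for name, p in baseline.items())
    return 1.0 - total_after / total_before


# Per-component average savings as reported in the abstract.
reported_savings = {"switching": 0.494, "internal": 0.439, "leakage": 0.149}

# ASSUMPTION: an illustrative baseline power split in arbitrary units.
# These numbers are invented for the example and do not come from the paper.
hypothetical_baseline = {"switching": 50.0, "internal": 45.0, "leakage": 5.0}

saving = total_saving(hypothetical_baseline, reported_savings)
print(f"total saving under the assumed split: {saving:.1%}")
```

The total saving is a weighted average of the component savings, so it always lies between the smallest and largest component saving. A small leakage saving therefore has little effect on the total when leakage is a small share of baseline cache power.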

[1]  A. Turing On Computable Numbers, with an Application to the Entscheidungsproblem , 1937 .

[2]  Scott A. Mahlke,et al.  Processor Acceleration Through Automated Instruction Set Customization , 2003, MICRO.

[3]  Ricardo E. Gonzalez,et al.  Xtensa: A Configurable and Extensible Processor , 2000, IEEE Micro.

[4]  Luca Benini,et al.  Selective instruction compression for memory energy reduction in embedded systems , 1999, Proceedings. 1999 International Symposium on Low Power Electronics and Design (Cat. No.99TH8477).

[5]  Chris Weaver,et al.  CryptoManiac: a fast flexible architecture for secure communication , 2001, ISCA 2001.

[6]  Richard T. Witek,et al.  A 160 MHz 32 b 0.5 W CMOS RISC microprocessor , 1996, 1996 IEEE International Solid-State Circuits Conference. Digest of Technical Papers, ISSCC.

[7]  Yervant Zorian,et al.  2001 Technology Roadmap for Semiconductors , 2002, Computer.

[8]  Trevor N. Mudge,et al.  Reducing code size with run-time decompression , 2000, Proceedings Sixth International Symposium on High-Performance Computer Architecture. HPCA-6 (Cat. No.PR00550).

[9]  Trevor Mudge,et al.  MiBench: A free, commercially representative embedded benchmark suite , 2001 .

[10]  Trevor N. Mudge,et al.  Power: A First-Class Architectural Design Constraint , 2001, Computer.

[11]  Jörg Henkel,et al.  Code compression for low power embedded system design , 2000, Proceedings 37th Design Automation Conference.

[12]  Smaïl Niar,et al.  Impact of Code Compression on the Power Consumption in Embedded Systems , 2003, Embedded Systems and Applications.

[13]  A. Cozzolino,et al.  Powerpc microprocessor family: the programming environments , 1994 .

[14]  A. Church An Unsolvable Problem of Elementary Number Theory , 1936 .

[15]  Kevin D. Kissell MIPS16: High-density MIPS for the Embedded Market , 1997 .

[16]  Todd M. Austin,et al.  SimpleScalar: An Infrastructure for Computer System Modeling , 2002, Computer.

[17]  Geoffrey Brown,et al.  Lx: a technology platform for customizable VLIW embedded processing , 2000, ISCA '00.

[18]  Mahmut T. Kandemir,et al.  Leakage Current: Moore's Law Meets Static Power , 2003, Computer.

[19]  Gary S. Tyson,et al.  FITS: framework-based instruction-set tuning synthesis for embedded application specific processors , 2004, Proceedings. 41st Design Automation Conference, 2004.

[20]  A. Church Review: A. M. Turing, On Computable Numbers, with an Application to the Entscheidungsproblem , 1937 .

[21]  Trevor Mudge,et al.  Challenges for architectural level power modeling , 2002 .