Power Estimation and Optimization Methodologies for VLIW-Based Embedded Systems

Power Estimation Methods.- Background.- Instruction-Level Power Estimation for VLIW Processor Cores.- Software Power Estimation of the LX Core: A Case Study.- System-Level Power Estimation for the LX Architecture.- Microprocessor Abstraction Levels.- Power Optimization Methods.- Background.- A Micro-Architectural Optimization for Low Power.- A Design Space Exploration Methodology.- Conclusions and Future Work.
