Exploring Functional Unit Design Space of VLIW Processors for Optimizing Both Performance and Energy Consumption

The number of functional units can have a significant impact on both the performance and the energy consumption of VLIW processors. This paper uses design space exploration to find the integer functional unit configurations that achieve the best energy-delay product (EDP) for different media applications. Our experimental results indicate quantitatively that, to balance performance and energy on VLIW processors, the number of integer functional units should match the instruction-level parallelism that can be extracted from the application.
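As a rough illustration of the exploration described above, the following Python sketch sweeps the number of integer functional units and picks the configuration with the lowest EDP, computed as energy multiplied by delay. The cost model in `toy_evaluate` and all of its parameters (`ilp`, `ops`, `energy_per_op`, `leak_per_unit_cycle`) are hypothetical stand-ins, not the paper's simulator or measured data; they only assume that cycle count stops improving once the unit count exceeds the available instruction-level parallelism, while idle units keep consuming leakage energy.

```python
def edp(energy, delay):
    """Energy-delay product: lower is better."""
    return energy * delay


def toy_evaluate(num_units, ilp=3, ops=1200,
                 energy_per_op=1.0, leak_per_unit_cycle=0.5):
    """Hypothetical cost model (an assumption, not the paper's simulator).

    Delay: the operations are spread over the units, but no more than
    `ilp` operations can be issued per cycle.
    Energy: a fixed dynamic cost per operation plus a leakage cost that
    every unit pays on every cycle, whether it is busy or idle.
    """
    cycles = ops / min(num_units, ilp)
    energy = ops * energy_per_op + num_units * cycles * leak_per_unit_cycle
    return energy, cycles


def explore(unit_counts, evaluate):
    """Evaluate every candidate unit count and return the EDP-optimal one."""
    results = {n: edp(*evaluate(n)) for n in unit_counts}
    best = min(results, key=results.get)
    return best, results


best, results = explore([1, 2, 3, 4, 6], toy_evaluate)
for n in sorted(results):
    marker = " <- lowest EDP" if n == best else ""
    print(f"{n} integer units: EDP = {results[n]:.0f}{marker}")
```

In this toy model the lowest EDP falls at the unit count equal to the assumed `ilp`: fewer units lengthen execution, and more units add leakage without shortening it. A real exploration would replace `toy_evaluate` with cycle counts and energy figures obtained from a compiler and simulator for each application.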
