Power breakdown analysis for a heterogeneous NoC platform running a video application

Users expect future handheld devices to provide extended multimedia functionality while retaining long battery life. Such applications impose heavy constraints on performance and power consumption, forcing designers to optimize every part of the platform. Evaluating the overall platform power breakdown is therefore critical for deciding where to spend power-optimization effort. Surprisingly, few studies exist on this topic, and decisions generally rely on common belief. We have carried out a complete power breakdown for a realistic platform to identify the major power bottlenecks. This paper presents that power assessment for a realistic heterogeneous network-on-chip (NoC) platform, including processors, network, and data/instruction memory hierarchy, running a video processing chain from camera to display. Our power breakdown identifies the main bottlenecks in the memory hierarchy and the foreground memory, and shows that the global interconnect is not that critical when the application mapping is well optimized.
