Distributed power management of real-time applications on a GALS multiprocessor SOC

Reducing power consumption is generally desirable in embedded systems. Dynamic Voltage and Frequency Scaling (DVFS) is a commonly applied technique that reduces power at the cost of computational performance. Multiprocessor Systems-on-Chip (MPSoCs) can have multiple voltage and frequency domains, e.g. one per core. When DVFS is applied to real-time applications, its effects on timing must be accounted for in the associated formal timing model. In this work, we contribute a distributed multi-core run-time power-management technique for real-time dataflow applications that uses per-core lookup tables to select low-power DVFS operating points that meet the application's timing requirement. We describe in detail how timing slack is observed locally at run-time on each core and how it is used to select a local DVFS operating point. We further describe our static off-line formal analysis technique that generates these per-core lookup tables, which link timing slack to low-power DVFS operating points. We provide an experimental analysis of the proposed technique using an H.263 decoder application mapped onto an FPGA-prototyped hardware platform.
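
The per-core selection step described above can be illustrated with a minimal sketch. It is not the implementation from this work: the table contents, the units, and the names `OperatingPoint`, `SLACK_TABLE`, `observed_slack`, and `select_operating_point` are hypothetical. In the described technique the table would come from the off-line formal analysis; here it is filled with made-up values. The sketch also assumes that slack is the difference between a latest permissible start time and the locally observed start time.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """A DVFS operating point (hypothetical representation)."""
    freq_mhz: int
    voltage_v: float


# Hypothetical per-core lookup table, sorted by ascending slack threshold.
# Each entry reads: "with at least this much slack (in microseconds),
# this operating point is assumed to still meet the timing requirement".
# The numbers are invented for illustration only.
SLACK_TABLE = [
    (0, OperatingPoint(freq_mhz=100, voltage_v=1.2)),    # no slack: fastest point
    (200, OperatingPoint(freq_mhz=75, voltage_v=1.1)),
    (500, OperatingPoint(freq_mhz=50, voltage_v=1.0)),   # ample slack: slowest point
]


def observed_slack(latest_start_us: int, actual_start_us: int) -> int:
    """Slack observed locally on a core (assumed definition):
    how much earlier than the latest permissible start the task started."""
    return latest_start_us - actual_start_us


def select_operating_point(slack_us: int, table=SLACK_TABLE) -> OperatingPoint:
    """Pick the entry with the largest threshold not exceeding the slack.

    Negative slack falls back to the first (fastest) entry.
    """
    thresholds = [threshold for threshold, _ in table]
    index = bisect_right(thresholds, slack_us) - 1
    return table[max(index, 0)][1]
```

For example, a task that starts 250 microseconds before its latest permissible start would, with this made-up table, select the 75 MHz point. Because each core consults only its own table and its own observed slack, the selection needs no run-time communication between cores, which is the sense in which such a scheme is distributed.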
