Reliability-Aware Runtime Power Management for Many-Core Systems in the Dark Silicon Era

Power management of networked many-core systems with runtime application mapping becomes more challenging in the dark silicon era. Network characteristics must be considered at runtime to achieve better performance while honoring the upper bound on peak power. At the same time, power management directly affects chip temperature, which is the main driver of aging effects. Therefore, in addition to meeting performance requirements, the control mechanism must also account for the current reliability of the cores when manipulating its actuators, so as to extend overall system lifetime in the long term. In this paper, we propose a multiobjective dynamic power management technique that uses current power consumption and other network characteristics, including the reliability of the cores, as feedback, and uses fine-grained voltage and frequency scaling and per-core power gating as actuators. In addition, a disturbance rejecter and a reliability balancer are designed to help the controller smooth power consumption in the short term and balance reliability in the long term, respectively. Simulations of dynamic workloads and mixed-criticality application profiles show that our method is not only effective in honoring the power budget while considerably boosting system throughput, but also increases overall system lifetime by minimizing aging effects through power consumption balancing.
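
To make the feedback structure described above concrete, the following is a minimal sketch of one power-capping control step, and not the controller proposed in the paper. It swaps the multiobjective controller for a simple greedy rule and omits the disturbance rejecter and reliability balancer. Everything named here is an assumption made for illustration: the frequency levels in `FREQ_LEVELS`, the cubic toy power model, the scalar `reliability` score, and the functions `level_power`, `total_power`, and `control_step`.

```python
from dataclasses import dataclass

# Hypothetical discrete frequency levels in GHz; level 0 (0.0 GHz) models a power-gated core.
FREQ_LEVELS = [0.0, 0.5, 1.0, 1.5, 2.0]


def level_power(level: int) -> float:
    # Toy model (assumption): power grows cubically with frequency.
    return 0.5 * FREQ_LEVELS[level] ** 3


@dataclass
class Core:
    level: int          # index into FREQ_LEVELS (the actuator setting)
    reliability: float  # assumed score: 1.0 = fresh, lower = more aged


def total_power(cores) -> float:
    # Feedback signal: total chip power under the toy model.
    return sum(level_power(c.level) for c in cores)


def control_step(cores, budget: float) -> float:
    """One feedback step: compare power with the budget and move one core by one level."""
    error = budget - total_power(cores)
    if error < 0:
        # Over budget: slow down (eventually gate) the most aged active core.
        active = [c for c in cores if c.level > 0]
        if active:
            victim = min(active, key=lambda c: c.reliability)
            victim.level -= 1
    else:
        # Under budget: speed up the healthiest core whose boost still fits the budget.
        candidates = [c for c in cores if c.level < len(FREQ_LEVELS) - 1]
        for c in sorted(candidates, key=lambda c: c.reliability, reverse=True):
            if level_power(c.level + 1) - level_power(c.level) <= error:
                c.level += 1
                break
    return total_power(cores)


# Usage: three cores at full speed draw 12.0 units against a budget of 8.0.
cores = [Core(4, 0.9), Core(4, 0.6), Core(4, 0.8)]
for _ in range(10):
    control_step(cores, budget=8.0)
print([c.level for c in cores], total_power(cores))
```

In this sketch the reliability scores are static inputs. A fuller model would update them from temperature and stress history, which is the part the abstract's long-term reliability feedback addresses.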
