DyPO

Modern multiprocessor systems-on-chip (MpSoCs) offer tremendous power and performance optimization opportunities by tuning thousands of potential voltage, frequency and core configurations. As the workload phases change at runtime, different configurations may become optimal with respect to power, performance or other metrics. Identifying the optimal configuration at runtime is infeasible due to the large number of workloads and configurations. This paper proposes a novel methodology that can find the Pareto-optimal configurations at runtime as a function of the workload. To achieve this, we perform an extensive offline characterization to find classifiers that map performance counters to optimal configurations. Then, we use these classifiers and performance counters at runtime to choose Pareto-optimal configurations. We evaluate the proposed methodology by maximizing the performance per watt for 18 single- and multi-threaded applications. Our experiments demonstrate an average increase of 93%, 81% and 6% in performance per watt compared to the interactive, ondemand and powersave governors, respectively.

[1]  Arnaldo Carvalho de Melo,et al.  The New Linux ’ perf ’ Tools , 2010 .

[2]  Sherief Reda,et al.  Pack & Cap: Adaptive DVFS and thread packing under power caps , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[3]  Andreas Gerstlauer,et al.  Accurate phase-level cross-platform power and performance estimation , 2016, 2016 53nd ACM/EDAC/IEEE Design Automation Conference (DAC).

[4]  Umit Y. Ogras,et al.  Dynamic Power Budgeting for Mobile Systems Running Graphics Workloads , 2018, IEEE Transactions on Multi-Scale Computing Systems.

[5]  Trevor Hastie,et al.  The Elements of Statistical Learning , 2001 .

[6]  Alexandre Yakovlev,et al.  Power--Aware Performance Adaptation of Concurrent Applications in Heterogeneous Many-Core Systems , 2016, ISLPED.

[7]  Vikram S. Adve,et al.  LLVM: a compilation framework for lifelong program analysis & transformation , 2004, International Symposium on Code Generation and Optimization, 2004. CGO 2004..

[8]  Tajana Simunic,et al.  System-Level Power Management Using Online Learning , 2009, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[9]  Jian Li,et al.  Dynamic power-performance adaptation of parallel computation on chip multiprocessors , 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006..

[10]  Luca Benini,et al.  A survey of design techniques for system-level dynamic power management , 2000, IEEE Trans. Very Large Scale Integr. Syst..

[11]  Maurizio Palesi,et al.  Multi-objective design space exploration using genetic algorithms , 2002, Proceedings of the Tenth International Symposium on Hardware/Software Codesign. CODES 2002 (IEEE Cat. No.02TH8627).

[12]  Narseo Vallina-Rodriguez,et al.  Energy Management Techniques in Modern Mobile Handsets , 2013, IEEE Communications Surveys & Tutorials.

[13]  Heba Khdr,et al.  Dark Silicon: From Computation to Communication , 2015, NOCS.

[14]  Marco D. Santambrogio,et al.  Workload-aware power optimization strategy for asymmetric multiprocessors , 2016, 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[15]  Vijay Janapa Reddi,et al.  High-performance and energy-efficient mobile web browsing on big/little systems , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).

[16]  Kai Li,et al.  The PARSEC benchmark suite: Characterization and architectural implications , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).

[17]  Vanchinathan Venkataramani,et al.  Hierarchical power management for asymmetric multi-core in dark silicon era , 2013, 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC).

[18]  Trevor Hastie,et al.  An Introduction to Statistical Learning , 2013, Springer Texts in Statistics.

[19]  George Ho,et al.  PAPI: A Portable Interface to Hardware Performance Counters , 1999 .

[20]  Trevor Mudge,et al.  MiBench: A free, commercially representative embedded benchmark suite , 2001 .

[21]  Radu Marculescu,et al.  Modeling, Analysis and Optimization of Network-on-Chip Communication Architectures , 2013, Lecture Notes in Electrical Engineering.

[22]  Saturnino Garcia,et al.  CortexSuite: A synthetic brain benchmark suite , 2014, 2014 IEEE International Symposium on Workload Characterization (IISWC).

[23]  Radu Marculescu,et al.  An Optimal Control Approach to Power Management for Multi-Voltage and Frequency Islands Multiprocessor Platforms under Highly Variable Workloads , 2012, 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip.

[24]  Vittorio Zaccaria,et al.  Multi-objective design space exploration of embedded systems , 2003, J. Embed. Comput..

[25]  Baoxin Zhao,et al.  A pareto-optimal runtime power budgeting scheme for many-core systems , 2016, Microprocess. Microsystems.

[26]  Radu Marculescu,et al.  Wireless NoC and Dynamic VFI Codesign: Energy Efficiency Without Performance Penalty , 2016, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[27]  Sanjay Ranka,et al.  Dynamic Reconfiguration in Real-Time Systems , 2013 .

[28]  Nikil D. Dutt,et al.  SPARTA: Runtime task allocation for energy efficient heterogeneous manycores , 2016, 2016 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS).

[29]  Xi Chen,et al.  Dynamic voltage and frequency scaling for shared resources in multicore processor designs , 2013, 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC).

[30]  Tajana Simunic,et al.  Temperature Aware Task Scheduling in MPSoCs , 2007, 2007 Design, Automation & Test in Europe Conference & Exhibition.

[31]  Patrick Mochel The sysfs Filesystem , 2005 .

[32]  Eric R. Ziegel,et al.  The Elements of Statistical Learning , 2003, Technometrics.

[33]  Diana Marculescu,et al.  Analysis of dynamic voltage/frequency scaling in chip-multiprocessors , 2007, Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07).

[34]  Brad Calder,et al.  Discovering and Exploiting Program Phases , 2003, IEEE Micro.

[35]  Sanjay Ranka,et al.  Dynamic Reconfiguration in Real-Time Systems: Energy, Performance, and Thermal Perspectives , 2012 .

[36]  Margaret Martonosi,et al.  Live, Runtime Phase Monitoring and Prediction on Real Systems with Application to Dynamic Power Management , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).

[37]  Ümit Y. Ogras,et al.  Predictive dynamic thermal and power management for heterogeneous mobile platforms , 2015, 2015 Design, Automation & Test in Europe Conference & Exhibition (DATE).