SOSA: Self-Optimizing Learning with Self-Adaptive Control for Hierarchical System-on-Chip Management

Resource management strategies for many-core systems dictate how resources such as power, processing cores, and memory bandwidth are shared among applications in order to achieve system goals. System goals require consideration of both system constraints (e.g., power envelope) and user demands (e.g., response time, energy efficiency). Existing approaches use heuristics, control theory, and machine learning for resource management. They all depend on static system models that require a priori knowledge of system dynamics, and are therefore too rigid to adapt to emerging workloads or changing system dynamics. We present SOSA, a cross-layer hardware/software hierarchical resource manager. Low-level controllers optimize knob configurations to meet potentially conflicting objectives (e.g., maximize throughput and minimize energy). SOSA accomplishes this for many-core systems and unpredictable dynamic workloads by using rule-based reinforcement learning to build subsystem models from scratch at runtime. SOSA employs a high-level supervisor to respond to system goals that change with operating conditions, e.g., switching from maximizing performance to minimizing power due to a thermal event. SOSA's supervisor translates the system goal into low-level objectives (e.g., core instructions-per-second (IPS)) in order to control subsystems by coordinating numerous knobs (e.g., core operating frequency, task distribution) towards achieving the goal. The software supervisor allows for flexibility, while the hardware learners allow quick and efficient optimization. We evaluate a simulation-based implementation of SOSA and demonstrate its ability to manage multiple interacting resources in the presence of conflicting objectives, its efficiency in configuring knobs, and its adaptability in the face of unpredictable workloads.
Executing a combination of machine-learning kernels and microbenchmarks on a multicore system-on-a-chip, SOSA achieves target performance with less than 1% error starting with an untrained model, maintains the performance in the face of workload disturbance, and automatically adapts to changing constraints at runtime. We also demonstrate the resource manager with a hardware implementation on an FPGA.
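The two-level structure described above can be illustrated with a minimal sketch. This is not SOSA's implementation: the class names, the goal-to-IPS mapping, the frequency list, the reward, and the linear plant model are all hypothetical, chosen only to show a supervisor handing an IPS objective to a rule-based learner that starts with an empty rule table and tunes a frequency knob at runtime.

```python
import random

FREQS_MHZ = [600, 1000, 1400, 1800]  # hypothetical frequency knob settings


class Supervisor:
    """Translates a system goal into a low-level IPS objective (made-up values)."""
    TARGETS = {"performance": 1.8e9, "low_power": 1.0e9}

    def objective(self, goal):
        return self.TARGETS[goal]


class RuleLearner:
    """Rule table mapping (condition, action) to a fitness learned from reward."""
    ACTIONS = (-1, 0, +1)  # step the frequency index down, hold, or step up

    def __init__(self, epsilon=0.1, lr=0.2, seed=0):
        self.fitness = {}  # starts empty: no a priori model
        self.epsilon = epsilon
        self.lr = lr
        self.rng = random.Random(seed)

    @staticmethod
    def condition(ips, target):
        # Coarse condition: measured IPS is below, near, or above the target.
        err = (ips - target) / target
        if err < -0.05:
            return "below"
        if err > 0.05:
            return "above"
        return "near"

    def act(self, cond):
        # Epsilon-greedy choice; untried rules default to fitness 0.0.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.ACTIONS)
        return max(self.ACTIONS, key=lambda a: self.fitness.get((cond, a), 0.0))

    def update(self, cond, action, reward):
        key = (cond, action)
        old = self.fitness.get(key, 0.0)
        self.fitness[key] = old + self.lr * (reward - old)


def measured_ips(freq_mhz):
    # Toy plant model (assumption): one instruction per cycle.
    return freq_mhz * 1e6


def run(goal, steps=50, epsilon=0.1, seed=0):
    """Runs the control loop and returns (final frequency in MHz, learner)."""
    target = Supervisor().objective(goal)
    learner = RuleLearner(epsilon=epsilon, seed=seed)
    idx = 0  # start at the lowest frequency
    for _ in range(steps):
        cond = learner.condition(measured_ips(FREQS_MHZ[idx]), target)
        action = learner.act(cond)
        idx = min(max(idx + action, 0), len(FREQS_MHZ) - 1)
        # Reward is the negated relative tracking error after the action.
        reward = -abs(measured_ips(FREQS_MHZ[idx]) - target) / target
        learner.update(cond, action, reward)
    return FREQS_MHZ[idx], learner
```

Calling `run("performance")` and `run("low_power")` exercises the same learner against two different objectives. Because every reward here is non-positive and untried rules default to zero fitness, the learner tries each action at least once even with `epsilon=0.0`; that is a property of this toy reward, not a claim about the paper's learner.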
