Tangram: Integrated Control of Heterogeneous Computers

Resource control in heterogeneous computers built with subsystems from different vendors is challenging. There is a tension between the need to quickly generate local decisions in each subsystem and the desire to coordinate the different subsystems for global optimization. In practice, global coordination among subsystems is considered hard, and current commercial systems use centralized controllers. The result is high response time and high design cost due to lack of modularity. To control emerging heterogeneous computers effectively, we propose a new control framework called Tangram that is fast, globally coordinated, and modular. Tangram introduces a new formal controller that combines multiple engines for optimization and safety, and has a standard interface. Building the controller for a subsystem requires knowing only about that subsystem. As a heterogeneous computer is assembled, the controllers in the different subsystems are connected hierarchically, exchanging standard coordination signals. To demonstrate Tangram, we prototype it in a heterogeneous server that we assemble using components from multiple vendors. Compared to state-of-the-art control, Tangram reduces, on average, the execution time of heterogeneous applications by 31% and their energy-delay product by 39%.
