Energy Efficiency Aspects of the AMD Zen 2 Architecture

In high-performance computing, systems are evaluated based on their computational throughput. However, performance in contemporary server processors is primarily limited by power and thermal constraints. Ensuring operation within a given power envelope requires a wide range of sophisticated control mechanisms. While some of these are handled transparently by hardware control loops, others are controlled by the operating system. A lack of publicly disclosed implementation details further complicates this topic. Nevertheless, understanding these mechanisms is a prerequisite for any effort to exploit the full computing capability and to minimize the energy consumption of today’s server systems. This paper highlights the various energy efficiency aspects of the AMD Zen 2 microarchitecture to facilitate system understanding and optimization. Key findings include qualitative and quantitative descriptions of core frequency transition delays, workload-based frequency limitations, and the effects of I/O die P-states on memory performance, as well as a discussion of the built-in power monitoring capabilities and their limitations. Moreover, we present specifics and caveats of idle states and wakeup times, as well as the impact of idle and inactive hardware threads and cores on the performance of active resources such as other cores.
