Low-Cost Per-Core Voltage Domain Support for Power-Constrained High-Performance Processors

Per-core voltage domains can improve performance under a power constraint. Most commercial processors, however, have only a single voltage domain shared by all processor cores, because splitting that domain into per-core voltage domains powered by multiple off-chip voltage regulators (VRs) makes the platform and package designs costly. On-chip switching VRs are an alternative, but integrating high-quality inductors for the VRs with the cores remains a technical challenge. In this paper, we propose a cost-effective power delivery technique to support per-core voltage domains. Our technique is based on two observations: 1) core-to-core (C2C) voltage variations are relatively small for most execution intervals when the voltages/frequencies are optimized to maximize performance under a power constraint, and 2) per-core power-gating devices augmented with feedback control circuitry can serve as low-cost VRs that provide high efficiency when observation 1) holds. Our experimental results show that processors using our technique can achieve power efficiency as high as that of processors using per-core on-chip switching VRs, at a much lower cost.
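
The link between the two observations can be illustrated with a small calculation. The sketch below is not the paper's model. It assumes that a power gate with feedback control behaves like an ideal linear regulator, whose efficiency is approximately the ratio of output to input voltage, and that all cores draw from one shared supply rail set to the highest requested core voltage. The function names, the `headroom` parameter, and the voltages and currents in the usage example are hypothetical.

```python
def linear_regulator_efficiency(v_out, v_in):
    """Efficiency of an ideal linear regulator (quiescent current ignored).

    The load current flows through the pass device, so the regulator
    dissipates (v_in - v_out) * i_load and the efficiency is v_out / v_in.
    """
    if not 0 < v_out <= v_in:
        raise ValueError("require 0 < v_out <= v_in")
    return v_out / v_in


def chip_efficiency(core_voltages, core_currents, headroom=0.0):
    """Aggregate delivery efficiency for cores fed from one shared rail.

    Assumption (hypothetical): the shared rail is set to the highest
    requested core voltage plus an optional headroom, and each core's
    power gate drops the rail down to that core's own voltage.
    """
    if len(core_voltages) != len(core_currents):
        raise ValueError("one current is needed per core voltage")
    v_in = max(core_voltages) + headroom
    p_out = sum(v * i for v, i in zip(core_voltages, core_currents))
    p_in = v_in * sum(core_currents)
    return p_out / p_in


if __name__ == "__main__":
    # Hypothetical operating points: small vs. large C2C voltage spread.
    small_spread = chip_efficiency([1.00, 0.95, 0.97, 0.98], [1.0] * 4)
    large_spread = chip_efficiency([1.00, 0.50], [1.0, 1.0])
    print(f"small spread: {small_spread:.3f}, large spread: {large_spread:.3f}")
```

Under these assumptions the efficiency loss grows with the gap between each core's voltage and the shared rail, which is why a small C2C voltage spread is what lets such a regulator remain efficient.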
