Mitigating the Impact of Variability on Chip-Multiprocessor Power and Performance

Chip-multiprocessors (CMPs) have emerged as a popular means of exploiting growing transistor budgets. However, the same technology scaling that increases the number of transistors on a single die also creates greater variability in their key power- and performance-determining characteristics. As the number of cores and the amount of memory per die increase, individual core and cache tiles will become small enough that traditional sources of intra-die power and performance variation will appear as tile-to-tile (T2T) variations. Starting from low-level models of the phenomena involved, we model how systematic within-die process variations, random within-die process variations, and thermal variations manifest themselves as T2T variations. Current commercial CMP designs are partitioned into fine-grained frequency islands (FIs) to allow per-core control of clock frequencies. We use our models to evaluate how this partitioning can be leveraged to address T2T variations. Exploiting the FI partitioning improves performance by an average of 8.4% relative to the fully-synchronous baseline when both process and thermal variability are addressed simultaneously, highlighting the importance of an integrated approach. The FI design can also achieve performance 7.1% higher than the baseline at fixed power or draw 24.2% less power at equal performance.
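The intuition behind the FI comparison can be illustrated with a toy calculation. The sketch below is not the paper's model: it assumes that each tile has its own maximum frequency, that a fully-synchronous chip must clock every tile at the slowest tile's frequency, and that throughput scales linearly with frequency. The function names, the Gaussian spread, and the 3.0 GHz nominal frequency are all hypothetical choices made for illustration.

```python
import random


def sample_tile_fmax(n_tiles, nominal_ghz=3.0, sigma_frac=0.05, seed=0):
    """Draw a hypothetical per-tile maximum frequency (GHz).

    Assumption: T2T variation is modeled as independent Gaussian noise
    around a nominal frequency. The abstract's models are derived from
    low-level process and thermal phenomena; this is only a stand-in.
    """
    rng = random.Random(seed)
    return [nominal_ghz * (1.0 + rng.gauss(0.0, sigma_frac))
            for _ in range(n_tiles)]


def synchronous_throughput(fmax):
    """One global clock: every tile runs at the slowest tile's frequency."""
    return min(fmax) * len(fmax)


def frequency_island_throughput(fmax):
    """Per-tile clocks: each tile runs at its own maximum frequency."""
    return sum(fmax)


def relative_gain(fmax):
    """Fractional throughput gain of per-tile clocks over a global clock.

    Assumption: throughput is proportional to clock frequency, which
    ignores memory-bound behavior and clock-domain-crossing overheads.
    """
    return frequency_island_throughput(fmax) / synchronous_throughput(fmax) - 1.0


if __name__ == "__main__":
    tiles = sample_tile_fmax(16)
    print(f"slowest tile: {min(tiles):.2f} GHz")
    print(f"toy FI gain over synchronous: {relative_gain(tiles):.1%}")
```

Under these assumptions the gain is never negative, because the sum of the per-tile frequencies is at least the tile count times the minimum. The numbers this toy model produces should not be compared with the 8.4% figure above, which comes from detailed process, thermal, and workload modeling.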
