Some Limits of Power Delivery in the Multicore Era

The ability to scale down threshold and hence supply voltages can no longer keep up with device density as technology scales. Microprocessor power density is therefore increasing. At the same time, the total number of C4 pads is predicted to remain constant for the foreseeable future, according to ITRS 2011. As a result, more and more of the C4 pads are dedicated to power delivery at the expense of off-chip I/O signals, impeding I/O throughput scaling, even though core counts and hence bandwidth requirements are increasing exponentially. It therefore becomes important to consider the power delivery network (PDN) as early as possible in the design process, both to ensure enough I/O pads and because a later redesign due to power delivery issues is costly. In this paper, we propose and validate a steady-state architecture-level PDN model, called VoltSpot, and explore the impact of the power delivery constraint for future technology nodes. Our results, based on a scaled multicore processor, indicate that worst-case on-chip IR drop at 16nm will be at least three times larger than that at 45nm. We propose a first-order optimization algorithm to derive the number and placement of C4 pads for power delivery to achieve a specific IR-drop target. When optimizing to satisfy an IR-drop constraint of 5%, power delivery requires so many pads that multicore processors at 16nm will not be able to maintain constant per-core I/O bandwidth.
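To make the notions of steady-state IR drop and pad placement concrete, the following is a minimal sketch and not the VoltSpot model or the paper's optimization algorithm. It assumes a single Vdd rail modeled as a uniform n x n resistive mesh with an ideal ground, a uniform current draw per node, and a simple greedy heuristic that adds a pad at the lowest-voltage node until a target is met. The function names (`solve_grid`, `greedy_pads`) and all resistance and current values are hypothetical, chosen only for illustration.

```python
import numpy as np

def solve_grid(n, pads, i_node, r_grid=0.05, r_pad=0.01, vdd=1.0):
    """Solve steady-state node voltages on an n x n resistive mesh.

    Assumptions (illustrative only): every node sinks i_node amps,
    neighbours are joined by r_grid ohms, and each pad ties its node
    to an ideal vdd source through r_pad ohms.
    """
    g_grid, g_pad = 1.0 / r_grid, 1.0 / r_pad
    size = n * n
    G = np.zeros((size, size))       # nodal conductance matrix
    b = np.full(size, -i_node)       # load current leaving each node
    for r in range(n):
        for c in range(n):
            k = r * n + c
            # Stamp the resistor to the right and the one below.
            for rr, cc in ((r, c + 1), (r + 1, c)):
                if rr < n and cc < n:
                    m = rr * n + cc
                    G[k, k] += g_grid
                    G[m, m] += g_grid
                    G[k, m] -= g_grid
                    G[m, k] -= g_grid
    for (r, c) in pads:
        k = r * n + c
        G[k, k] += g_pad             # pad resistor to the vdd source
        b[k] += g_pad * vdd
    return np.linalg.solve(G, b).reshape(n, n)

def greedy_pads(n, i_node, target=0.05, vdd=1.0, **kw):
    """Add pads at the worst node until IR drop <= target (fraction of vdd)."""
    pads = {(n // 2, n // 2)}        # start with one pad at the centre
    while True:
        v = solve_grid(n, pads, i_node, vdd=vdd, **kw)
        drop = (vdd - v.min()) / vdd
        if drop <= target or len(pads) == n * n:
            return pads, drop
        masked = v.copy()
        for p in pads:
            masked[p] = np.inf       # never pick an existing pad again
        r, c = np.unravel_index(np.argmin(masked), masked.shape)
        pads.add((int(r), int(c)))

if __name__ == "__main__":
    pads, drop = greedy_pads(8, i_node=0.05, target=0.05)
    print(f"{len(pads)} power pads, worst-case IR drop {drop:.2%}")
```

Under these assumptions, raising the per-node current (a stand-in for higher power density) increases the number of pads the heuristic needs to meet the same target, which mirrors the trade-off between power pads and I/O pads described above.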
