System level performance analysis and optimization for the adaptive clocking based multi-core processor

Supply voltage droop, temperature variation, and aging effects can cause timing failures during operation. Various adaptive clocking methods have been introduced to address these problems. They use a tunable clock to avoid timing failures instead of relying on wide design guard bands. However, adaptive clocking complicates system performance analysis in a multi-core system. In this paper, a queueing-theory-based system-level performance model is proposed that estimates the average response time and power with a closed-form equation. Furthermore, for a multi-core system with adaptive clocking, an optimal job scheduling method based on the inequality of arithmetic and geometric means is proposed. The proposed job scheduling method mitigates the system performance degradation caused by adaptive clocking. The proposed performance model estimates system-level performance to within ∼3% error compared with the JMT system simulation tool. Experimental results also show that the proposed job scheduling method achieves significantly better performance than the conventional round-robin method.
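The abstract does not give the model itself, so the following is only a minimal sketch of the general idea under stated assumptions: each core is treated as an independent M/M/1 queue, adaptive clocking is represented by a lower effective service rate on the affected core, and jobs are routed to cores at random with fixed probabilities. Under those assumptions the mean response time has a closed form, and the split that minimizes it follows from the Cauchy-Schwarz inequality, a close relative of the arithmetic-geometric mean inequality. The function names and the example rates are hypothetical and are not taken from the paper.

```python
from math import sqrt


def mean_response_time(arrival_rate, service_rates, split):
    """Mean response time of parallel M/M/1 queues with probabilistic routing.

    split[i] is the fraction of jobs sent to core i (assumed to sum to 1).
    Returns infinity if any core receives jobs at or above its service rate.
    """
    total = 0.0
    for mu, p in zip(service_rates, split):
        lam = arrival_rate * p              # arrival rate seen by this core
        if lam >= mu:
            return float("inf")             # this core's queue is unstable
        total += p / (mu - lam)             # p * M/M/1 response time 1/(mu - lam)
    return total


def sqrt_rule_split(arrival_rate, service_rates):
    """Split minimizing mean response time when every core stays in use.

    Minimizing sum(mu_i / x_i) with spare capacity x_i = mu_i - lam_i and a
    fixed sum of x_i gives x_i proportional to sqrt(mu_i) (Cauchy-Schwarz).
    """
    slack = sum(service_rates) - arrival_rate
    if slack <= 0:
        raise ValueError("arrival rate exceeds total service capacity")
    root_sum = sum(sqrt(mu) for mu in service_rates)
    rates = [mu - sqrt(mu) * slack / root_sum for mu in service_rates]
    if min(rates) < 0:
        # Assumption of this sketch: no core is slow enough to be left idle.
        raise ValueError("a core would get a negative share; drop it and retry")
    return [r / arrival_rate for r in rates]


# Hypothetical example: core 1 runs slower because its clock was stretched.
service_rates = [4.0, 2.25]   # jobs per unit time (illustrative values)
arrival_rate = 2.75

round_robin = [0.5, 0.5]
optimal = sqrt_rule_split(arrival_rate, service_rates)

print("round-robin:", mean_response_time(arrival_rate, service_rates, round_robin))
print("sqrt rule:  ", mean_response_time(arrival_rate, service_rates, optimal))
```

In this toy setting the equal split overloads the slowed core relative to its capacity, while the square-root rule shifts work toward the faster core. A real design would also need to handle cores that should receive no jobs and service-time distributions that are not exponential.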
