Bias-Compensated Least Squares Identification of Distributed Thermal Models for Many-Core Systems-on-Chip

The thermal wall for many-core systems-on-chip calls for advanced management techniques that maximize performance while capping temperatures. Distributed and compact thermal models are a cornerstone of such techniques. System identification methodologies allow these models to be extracted directly from the thermal response of the target device. Unfortunately, standard Auto-Regressive eXogenous (ARX) models and Least Squares techniques cannot effectively handle both the model approximation and the measurement noise typical of real systems. In this work, we propose a novel distributed identification strategy to derive distributed interacting thermal models. The presented method can cope with both process noise and temperature sensor noise affecting the inputs and outputs of the adopted models. Online and offline versions are presented, and issues related to model order, sampling time and input stimuli are addressed. The proposed method is applied to Intel's Single-chip Cloud Computer many-core prototype.
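To make the limitation of plain Least Squares concrete, the following is a minimal sketch, not the method proposed in this work. It assumes a hypothetical first-order scalar model T[k+1] = a·T[k] + b·u[k], white measurement noise on both temperature and input, and noise variances that are known in advance purely for illustration. All names (`simulate`, `least_squares`, `bias_compensated_ls`) and parameter values are assumptions introduced here.

```python
import numpy as np


def simulate(a, b, n, sigma_y, sigma_u, rng):
    """Simulate a hypothetical first-order thermal model with noisy measurements."""
    u = rng.choice([-1.0, 1.0], size=n)  # zero-mean binary excitation (assumed)
    t = np.zeros(n)
    for k in range(n - 1):
        t[k + 1] = a * t[k] + b * u[k]  # noise-free dynamics
    y = t + sigma_y * rng.standard_normal(n)       # noisy temperature reading
    u_meas = u + sigma_u * rng.standard_normal(n)  # noisy input reading
    return y, u_meas


def least_squares(y, u):
    """Plain LS fit of [a, b]; biased when the regressors contain noise."""
    phi = np.column_stack([y[:-1], u[:-1]])
    return np.linalg.solve(phi.T @ phi, phi.T @ y[1:])


def bias_compensated_ls(y, u, sigma_y, sigma_u):
    """LS with the regressor noise covariance subtracted from phi^T phi.

    Assumes white, mutually independent noise with KNOWN variances.
    """
    phi = np.column_stack([y[:-1], u[:-1]])
    n = phi.shape[0]
    noise_cov = np.diag([sigma_y**2, sigma_u**2])
    return np.linalg.solve(phi.T @ phi - n * noise_cov, phi.T @ y[1:])


rng = np.random.default_rng(0)
a_true, b_true = 0.9, 0.5  # illustrative values, not taken from the paper
y, u_meas = simulate(a_true, b_true, 20000, sigma_y=0.5, sigma_u=0.1, rng=rng)

theta_ls = least_squares(y, u_meas)
theta_bc = bias_compensated_ls(y, u_meas, sigma_y=0.5, sigma_u=0.1)
print("true:", (a_true, b_true))
print("LS:  ", theta_ls)
print("BCLS:", theta_bc)
```

In this sketch the plain estimate of `a` is pulled toward zero because the noise variance inflates the diagonal of the regressor correlation matrix, while subtracting that variance removes the asymptotic bias. In practice the noise variances are not known and would have to be estimated together with the model parameters.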
