Instruction-Level NBTI Stress Estimation and Its Application in Runtime Aging Prediction for Embedded Processors

Lifetime reliability management of miniaturized CMOS devices is becoming increasingly important as technology nodes shrink. Neither existing design-time solutions (such as guard-banding) nor runtime methods (such as reactive monitoring) address this issue efficiently; proactive approaches based on runtime aging prediction are a more promising way to provide resiliency. Among the reliability-threatening mechanisms in recent technologies, negative bias temperature instability (NBTI) is the dominant factor; it depends on multiple time-varying operational parameters, including temperature, supply voltage, and stress. This paper proposes an efficient instruction-level stress estimation model and, based on it, introduces a runtime aging prediction approach for embedded processors that accounts for the simultaneous impact of temperature, supply voltage, and stress variations. We propose two metrics, the instruction degradation factor and the architecture degradation factor, for fine-grained stress estimation and recurring runtime aging prediction, respectively. We also provide a simulation environment for model validation. Simulation results on several benchmarks show that the proposed stress estimation model has an accuracy of about 92%, indicating that the method is accurate enough for runtime use while remaining simple.
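
The abstract does not define how per-instruction stress values are combined, so the following is only an illustrative sketch of what instruction-level stress estimation over an executed trace might look like. It assumes, hypothetically, that each instruction type carries a precharacterized stress factor between 0 and 1 and that the stress of a time window is the execution-count-weighted mean of those factors. The table `ASSUMED_FACTORS`, its values, and both function names are made up for illustration and are not taken from the paper.

```python
from collections import Counter

# Hypothetical per-instruction stress factors in [0, 1].
# These numbers are illustrative placeholders, not measured data.
ASSUMED_FACTORS = {
    "add": 0.40,
    "mul": 0.70,
    "load": 0.55,
    "store": 0.50,
    "branch": 0.30,
}


def estimate_stress(trace, factors=ASSUMED_FACTORS):
    """Return the execution-count-weighted mean stress factor of a trace.

    `trace` is an iterable of instruction mnemonics executed in one window.
    This weighting scheme is an assumption, not the paper's definition.
    """
    counts = Counter(trace)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("trace must contain at least one instruction")
    return sum(factors[op] * n for op, n in counts.items()) / total


def estimation_accuracy(estimated, reference):
    """Accuracy as one minus the relative error against a reference value."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return 1.0 - abs(estimated - reference) / abs(reference)


if __name__ == "__main__":
    window = ["add", "add", "mul", "load"]
    print(estimate_stress(window))
```

A runtime predictor could evaluate such a function once per window and feed the result, together with temperature and supply voltage readings, into an aging model; how the paper actually does this is not specified in the abstract.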
