CEP: Correlated Error Propagation for Hierarchical Soft Error Analysis

Due to continuous technology scaling, soft errors have become a major reliability issue at nanoscale technologies. Single or multiple event transients at low levels can result in multiple correlated bit flips at the logic level or at higher abstraction levels. Addressing this correlation is essential for accurate low-level soft error rate estimation and, more importantly, for cross-level error abstraction, e.g. from bit errors at the logic level to word errors at the register-transfer level. This paper proposes a novel error estimation method that takes both signal and error correlations into consideration. It unifies the treatment of error-free and erroneous signals, so that error probabilities and correlations can be computed using techniques for calculating signal probabilities and correlations. The proposed method not only reports accurate error probabilities when internal gates are impaired by soft errors, but also quantifies the error correlations as the errors propagate. This feature makes the method applicable to high-level error estimation. Experimental results show that, compared with Monte-Carlo simulation, the proposed technique is 5 orders of magnitude faster, while the average inaccuracy of the error probability estimates is only 0.02.
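To illustrate why correlation matters, the following minimal Python sketch compares a signal probability computed under an independence assumption with the exact value for a small circuit with reconvergent fanout. It is not the CEP algorithm itself. The circuit `y = (a AND b) OR (a AND c)`, the input probabilities of 0.5, the stuck-flip fault on input `a`, and all function names are assumptions chosen for illustration only. The sketch also shows the idea of treating an error as an ordinary signal, here the XOR of the fault-free and faulty outputs, whose probability can then be computed like any other signal probability.

```python
from itertools import product


def exact_prob(fn, probs):
    """Exact probability that fn(*bits) == 1, enumerating all input vectors.

    Inputs are assumed independent, with probs[i] = P(input i == 1).
    """
    total = 0.0
    for bits in product((0, 1), repeat=len(probs)):
        weight = 1.0
        for bit, p in zip(bits, probs):
            weight *= p if bit else 1.0 - p
        if fn(*bits):
            total += weight
    return total


def golden(a, b, c):
    """Fault-free circuit: input a fans out to two AND gates that reconverge."""
    return (a & b) | (a & c)


def faulty(a, b, c):
    """Hypothetical fault: the value of input a is flipped before the fanout."""
    flipped = a ^ 1
    return (flipped & b) | (flipped & c)


def error_signal(a, b, c):
    """Error treated as a signal: 1 when golden and faulty outputs differ."""
    return golden(a, b, c) ^ faulty(a, b, c)


def naive_output_prob(pa, pb, pc):
    """Signal probability of y if the two AND outputs are (wrongly)
    assumed independent, ignoring their shared input a."""
    p_g1 = pa * pb
    p_g2 = pa * pc
    return p_g1 + p_g2 - p_g1 * p_g2


probs = (0.5, 0.5, 0.5)
p_naive = naive_output_prob(*probs)       # ignores the correlation
p_exact = exact_prob(golden, probs)       # accounts for the correlation
p_error = exact_prob(error_signal, probs)  # error probability at the output

print(f"naive P(y=1) = {p_naive}")
print(f"exact P(y=1) = {p_exact}")
print(f"P(output error | a flipped) = {p_error}")
```

In this toy case the independence assumption overestimates the output probability (0.4375 versus the exact 0.375), because both AND gates depend on the same input. Exhaustive enumeration is exponential in the number of inputs and is used here only as a reference.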
