Fault injection acceleration by simultaneous injection of non-interacting faults

Fault injection is the de facto standard for evaluating the sensitivity of digital systems to transient errors. Because of various masking effects, only a very small portion of the injected faults lead to system-level failures, so a very large number of faults must be injected to achieve statistically meaningful results. At the same time, since the majority of injected faults are masked, many simulation cycles are wasted on tracking each injected fault separately. In this paper, we propose an opportunistic acceleration technique that evaluates the impact of multiple non-interacting faults in a single workload execution. If no failure is observed, the technique skips the individual evaluation of those faults, which leads to a significant speedup. Experimental results on the Leon3 processor show that the proposed technique shortens the fault injection runtime by two orders of magnitude.
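The batching idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names `evaluate_faults` and `run_workload`, the fixed-size batching, and the toy fault model are all assumptions made for illustration. In particular, the sketch simply assumes that the faults placed in one batch do not interact, and it does not show how such groups would be selected.

```python
from typing import Callable, Dict, Sequence, Tuple


def evaluate_faults(
    faults: Sequence[int],
    run_workload: Callable[[Sequence[int]], bool],
    batch_size: int,
) -> Tuple[Dict[int, bool], int]:
    """Classify each fault as failing (True) or masked (False).

    run_workload(batch) is a hypothetical hook that executes the workload
    once with every fault in `batch` injected and returns True if a
    system-level failure is observed. Faults within a batch are ASSUMED
    to be non-interacting.
    """
    results: Dict[int, bool] = {}
    runs = 0
    for start in range(0, len(faults), batch_size):
        batch = faults[start:start + batch_size]
        runs += 1
        if not run_workload(batch):
            # No failure observed: every fault in the batch is masked,
            # so the individual evaluations are skipped.
            for fault in batch:
                results[fault] = False
            continue
        # A failure was observed: fall back to one run per fault.
        for fault in batch:
            runs += 1
            results[fault] = run_workload([fault])
    return results, runs


# Toy model (assumption): 1000 candidate faults, two of which cause a failure.
critical = {137, 642}
fault_list = list(range(1000))
results, runs = evaluate_faults(
    fault_list,
    run_workload=lambda batch: any(f in critical for f in batch),
    batch_size=50,
)
print(runs)  # 120 workload executions instead of 1000
```

In this toy setting the saving comes entirely from the high masking rate: 18 of the 20 batches show no failure and are resolved with a single run each. The speedup in a real campaign depends on the actual masking rate and on how the non-interacting groups are formed.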
