Machine Learning Clustering Techniques for Selective Mitigation of Critical Design Features

Selective mitigation, also called selective hardening, is an effective way to balance the improvement in a circuit's overall reliability against the hardware overhead introduced by the hardening techniques. It relies on preferentially protecting circuit instances according to their susceptibility and criticality. However, ranking circuit parts by vulnerability usually requires computationally intensive fault-injection simulation campaigns. This paper presents a new methodology that uses machine learning clustering techniques to group flip-flops with similar expected contributions to the overall functional failure rate, based on the analysis of a compact set of features combining attributes from static and dynamic elements. Fault simulation campaigns can then be executed on a per-group basis, significantly reducing the time and cost of the evaluation. The effectiveness of grouping flip-flops of similar sensitivity with machine learning clustering algorithms is evaluated on a practical example. Different clustering algorithms are applied, and the results are compared to an ideal selective mitigation obtained by exhaustive fault-injection simulation.
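The per-group flow described above can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the three features, the flip-flop names, the failure-rate values, the choice of k-means, and the helper `run_fault_injection` are all hypothetical and stand in for whatever features, clustering algorithms, and fault-injection tooling are actually used. The sketch assumes one representative flip-flop is simulated per group and that its result is applied to every member of that group.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


# Hypothetical normalised features per flip-flop:
# (fan-out, logic depth, toggle activity). Values are made up.
features = {
    "ff_ctrl_0": (0.90, 0.80, 0.70),
    "ff_ctrl_1": (0.85, 0.75, 0.80),
    "ff_ctrl_2": (0.95, 0.85, 0.75),
    "ff_data_0": (0.10, 0.20, 0.15),
    "ff_data_1": (0.15, 0.10, 0.20),
    "ff_data_2": (0.05, 0.15, 0.10),
}

# Made-up failure rates standing in for real fault-injection results.
FAKE_FAILURE_RATE = {
    "ff_ctrl_0": 0.42, "ff_ctrl_1": 0.40, "ff_ctrl_2": 0.45,
    "ff_data_0": 0.03, "ff_data_1": 0.02, "ff_data_2": 0.04,
}
campaigns_run = []


def run_fault_injection(name):
    """Hypothetical stand-in for an expensive fault-injection campaign."""
    campaigns_run.append(name)
    return FAKE_FAILURE_RATE[name]


names = list(features)
labels = kmeans([features[n] for n in names], k=2)

# Simulate one representative per group and reuse its result for the group.
group_estimate = {}
for name, lab in zip(names, labels):
    if lab not in group_estimate:
        group_estimate[lab] = run_fault_injection(name)
estimate = {name: group_estimate[lab] for name, lab in zip(names, labels)}

# Selective mitigation: harden the flip-flops with the highest estimates.
budget = 3  # hypothetical number of flip-flops that can be hardened
to_harden = sorted(names, key=lambda n: -estimate[n])[:budget]
print("campaigns run:", campaigns_run)
print("selected for hardening:", to_harden)
```

In this toy setting only two campaigns are run instead of six, which is the source of the time and cost savings. How well the group-level estimate matches an exhaustive campaign depends on how well the chosen features separate flip-flops by their actual failure contribution.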
