Machine Learning to Tackle the Challenges of Transient and Soft Errors in Complex Circuits

The Functional Failure Rate analysis of today’s complex circuits is a difficult task and requires a significant investment in terms of human efforts, processing resources and tool licenses. Thereby, de-rating or vulnerability factors are a major instrument of failure analysis efforts. Usually computationally intensive fault-injection simulation campaigns are required to obtain a fine-grained reliability metrics for the functional level. Therefore, the use of machine learning algorithms to assist this procedure and thus, optimising and enhancing fault injection efforts, is investigated in this paper. Specifically, machine learning models are used to predict accurate per-instance Functional De-Rating data for the full list of circuit instances, an objective that is difficult to reach using classical methods. The described methodology uses a set of per-instance features, extracted through an analysis approach, combining static elements (cell properties, circuit structure, synthesis attributes) and dynamic elements (signal activity). Reference data is obtained through first-principles fault simulation approaches. One part of this reference dataset is used to train the machine learning model and the remaining is used to validate and benchmark the accuracy of the trained tool. The presented methodology is applied on a practical example and various machine learning models are evaluated and compared.

[1]  Yoshua Bengio,et al.  Random Search for Hyper-Parameter Optimization , 2012, J. Mach. Learn. Res..

[2]  J. A. Maestro,et al.  A Methodology for Automatic Insertion of Selective TMR in Digital Circuits Affected by SEUs , 2009, IEEE Transactions on Nuclear Science.

[3]  Gerard V. Trunk,et al.  A Problem of Dimensionality: A Simple Example , 1979, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  R.C. Baumann,et al.  Radiation-induced soft errors in advanced semiconductor technologies , 2005, IEEE Transactions on Device and Materials Reliability.

[5]  Arnaud Virazel,et al.  A Low-Cost Reliability vs. Cost Trade-Off Methodology to Selectively Harden Logic Circuits , 2017, J. Electron. Test..

[6]  Paul D. Franzon,et al.  FreePDK: An Open-Source Variation-Aware Design Kit , 2007, 2007 IEEE International Conference on Microelectronic Systems Education (MSE'07).

[7]  Gaël Varoquaux,et al.  Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..

[8]  Robert Tibshirani,et al.  An Introduction to the Bootstrap , 1994 .

[9]  S. Katkoori,et al.  Selective triple Modular redundancy (STMR) based single-event upset (SEU) tolerant synthesis for FPGAs , 2004, IEEE Transactions on Nuclear Science.

[10]  Ron Kohavi,et al.  A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection , 1995, IJCAI.

[11]  B. Narasimham,et al.  Radiation-Induced Soft Error Rates of Advanced CMOS Bulk Devices , 2006, 2006 IEEE International Reliability Physics Symposium Proceedings.

[12]  Dan Alexandrescu,et al.  Towards optimized functional evaluation of SEE-induced failures in complex designs , 2012, 2012 IEEE 18th International On-Line Testing Symposium (IOLTS).

[13]  Y. Yagil,et al.  A systematic approach to SER estimation and solutions , 2003, 2003 IEEE International Reliability Physics Symposium Proceedings, 2003. 41st Annual..