Uncertainty-Aware Compositional System-Level Reliability Analysis

Continuous technology scaling has increased the susceptibility of today’s electronic devices to manufacturing tolerances and environmental changes. The resulting uncertainty in component reliability can only be approximated or estimated at design time and may propagate to the system level. Uncertainty must therefore be considered to enable the design of robust systems. In this chapter, we propose a methodology for cross-level reliability analysis to tame the ever-increasing analysis complexity of contemporary systems under the influence of uncertainties. The presented methodology combines various reliability analysis techniques across different levels of abstraction while modeling uncertainties explicitly. It introduces mechanisms for (a) composing and decomposing the system during analysis and (b) converting analysis data between different levels of abstraction through adapters. The developed analysis techniques are integrated into an automatic electronic system-level reliability analysis tool that allows for the evaluation of reliability-increasing techniques and for design space exploration (DSE). The tool uses meta-heuristic optimization algorithms and enables the comparison of system implementation candidates whose objectives are represented by uncertainty distributions.
