Reliability annotations to formal specifications of context-sensitive safety properties in embedded systems

As reliability becomes increasingly important in safety-critical embedded systems, a formalism for specifying the reliability requirements of such systems is needed. We present a formalism for succinctly modeling the reliability requirements of safety-critical embedded systems and propose its semantics over the task schedule of the embedded system's controller. We introduce the notion of reliability deficiency, defined as the difference between the specified reliability and the reliability actually achieved by a schedule, and present techniques for making up this deficiency. The approach applies primarily to specifying the reliability requirements of context-sensitive tasks executed by a real-time software system, so that they can overcome transient failures using temporal redundancy, i.e., repeated execution of the same task. We illustrate the formalism and the proposed techniques with scenarios from the automotive domain.
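
To make the idea of reliability deficiency concrete, the following is a minimal sketch rather than the formalism of the paper. It assumes a simple failure model that the abstract does not state: each execution of a task succeeds independently with probability p_success, and a task succeeds if at least one of its repeated executions succeeds. All function and parameter names are hypothetical.

```python
def achieved_reliability(p_success: float, executions: int) -> float:
    """Probability that at least one of `executions` independent runs succeeds.

    Assumed model (not taken from the paper): independent transient failures,
    each run succeeding with probability `p_success`.
    """
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must lie in [0, 1]")
    if executions < 0:
        raise ValueError("executions must be non-negative")
    return 1.0 - (1.0 - p_success) ** executions


def reliability_deficiency(specified: float, p_success: float, executions: int) -> float:
    """Specified reliability minus achieved reliability, floored at zero."""
    return max(0.0, specified - achieved_reliability(p_success, executions))


def executions_needed(specified: float, p_success: float) -> int:
    """Smallest number of repeated executions whose reliability meets the target."""
    if not 0.0 < p_success < 1.0:
        raise ValueError("p_success must lie strictly between 0 and 1")
    if not 0.0 <= specified < 1.0:
        raise ValueError("specified must lie in [0, 1)")
    n = 0
    # Each extra run multiplies the failure probability by (1 - p_success),
    # so the loop terminates for any target strictly below 1.
    while achieved_reliability(p_success, n) < specified:
        n += 1
    return n
```

Under these assumptions, a task with p_success = 0.9 and a specified reliability of 0.995 has a deficiency of about 0.095 when scheduled once, and executions_needed(0.995, 0.9) reports that three executions make up the deficiency. Whether a schedule has room for the extra executions is a separate timing question that this sketch does not model.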
