We propose a two-tiered hierarchical approach for detecting faults in embedded control software during its runtime operation: the observed behavior is monitored against the appropriate specifications at two levels, namely the software level and the controlled-system level. (The additional controlled-system-level monitoring safeguards against any incompleteness in the software-level monitoring.) A software fault is detected immediately when a software-level monitor rejects an observed behavior. In contrast, when a system-level monitor rejects an observed behavior, this indicates a system-level failure, and an additional isolation step is required to determine whether a software fault occurred. This is done by tracking the executed behavior in the system model, which comprises the models of the software and of the nonfaulty hardware components: acceptance by this model indicates the presence of a software fault. The design of both the software-level and system-level monitors is modular and hence scalable (there is one monitor for each property); further, the monitors are constructed directly from the property specifications and do not require any software or system model. Such models are required only for the fault isolation step, when the detection occurs at the system level. We use input-output extended finite automata (I/O-EFA) for software-level as well as system-level modeling, and also for modeling the property monitors. Note that since the control changes only at the discrete times when the system/environment states are sampled, the controlled system has discrete-time hybrid dynamics, which can be modeled as an I/O-EFA.
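The two-tier detection and isolation logic described above can be illustrated with a minimal Python sketch. It is not the paper's construction: the names `Edge`, `Monitor`, `classify`, and `make_example`, the toy plant y[k+1] = y[k] + u[k], the bounds on `u` and `y`, and the label returned when the system model does not accept the trace are all assumptions made for illustration. Each monitor is a small automaton with guards and data updates, loosely in the spirit of an I/O-EFA, and it rejects a behavior when no outgoing edge is enabled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Obs = Dict[str, float]   # one sampled observation, e.g. {"y": 2.0, "u": 1.0}
Data = Dict[str, float]  # data variables of the automaton

@dataclass
class Edge:
    src: str
    dst: str
    guard: Callable[[Data, Obs], bool]
    update: Callable[[Data, Obs], Data]

@dataclass
class Monitor:
    """Hypothetical automaton with guards and data updates (I/O-EFA-like)."""
    name: str
    location: str
    data: Data
    edges: List[Edge]
    rejected: bool = False

    def step(self, obs: Obs) -> bool:
        """Consume one observation; return False once the behavior is rejected."""
        if self.rejected:
            return False
        for e in self.edges:
            if e.src == self.location and e.guard(self.data, obs):
                self.data = e.update(self.data, obs)
                self.location = e.dst
                return True
        self.rejected = True  # no enabled edge: behavior violates the property
        return False

def classify(trace: List[Obs], sw_monitors: List[Monitor],
             sys_monitors: List[Monitor],
             system_model: Monitor) -> Tuple[str, Optional[int]]:
    """Return a verdict and the sample index at which it was reached."""
    for k, obs in enumerate(trace):
        sw_ok = all([m.step(obs) for m in sw_monitors])
        sys_ok = all([m.step(obs) for m in sys_monitors])
        if not sw_ok:
            return ("software fault", k)  # tier 1: detected directly
        if not sys_ok:
            # Tier 2: isolation. The model is needed only now; replay the
            # executed behavior through software + nonfaulty hardware models.
            accepted = all([system_model.step(o) for o in trace[:k + 1]])
            # The label for the non-accepted case is an assumption of this sketch.
            return ("software fault" if accepted else "not isolated to software", k)
    return ("no fault detected", None)

def make_example() -> Tuple[List[Monitor], List[Monitor], Monitor]:
    """Build fresh monitors for an assumed toy system (all values hypothetical)."""
    keep = lambda d, o: d
    sw = Monitor("command bound", "ok", {},
                 [Edge("ok", "ok", lambda d, o: abs(o["u"]) <= 1.0, keep)])
    sy = Monitor("output limit", "ok", {},
                 [Edge("ok", "ok", lambda d, o: o["y"] <= 3.0, keep)])
    # Software model (switches off too late, at y >= 5) composed with the
    # nonfaulty plant model y[k+1] = y[k] + u[k].
    model = Monitor("system model", "run", {"y": 0.0}, [Edge(
        "run", "run",
        lambda d, o: abs(o["y"] - d["y"]) < 1e-9
        and o["u"] == (1.0 if o["y"] < 5.0 else 0.0),
        lambda d, o: {"y": o["y"] + o["u"]})])
    return [sw], [sy], model

software_bug = [{"y": float(i), "u": 1.0} for i in range(5)]
plant_glitch = [{"y": 0.0, "u": 1.0}, {"y": 1.0, "u": 1.0}, {"y": 9.0, "u": 0.0}]
print(classify(software_bug, *make_example()))  # ('software fault', 4)
print(classify(plant_glitch, *make_example()))  # ('not isolated to software', 2)
```

In the first trace every command satisfies the software-level bound, so only the system-level monitor rejects; the replay is accepted by the combined model, which attributes the failure to the software. In the second trace the observed output departs from the nonfaulty plant model, so the replay is not accepted and the sketch does not attribute the failure to the software.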