An Approach to Diagnosability Analysis for Interacting Finite State Systems

Fault isolation is the process of reasoning required to find the cause of a system failure. In a model-based approach, the available information is a model of the system and some observations. Using knowledge of how the system generally behaves, as given in the system model, together with partial observations of the events of the current situation the task is to deduce the failure causing event(s). In our setting, the observable events manifest themselves in a message log. We study post mortem fault isolation for moderately concurrent discrete event systems where the temporal order of logged messages contains little information. To carry out fault isolation one has to study the correlation between observed events and fault events of the system. In general, such study calls for exploration of the state space of the system, which is exponential in the number of system components. Since we are studying a restricted class of all possible systems we may apply aggressive specialized abstraction policies in order to allow fault isolation without ever considering the often intractably large state space of the system. In this thesis we describe a mathematical framework as well as a prototype implementation and an experimental evaluation of such abstraction techniques. The method is efficient enough to allow for not only post mortem fault isolation but also design time diagnosability analysis of the system, which can be seen as a non-trivial way of analyzing all possible observations of the system versus the corresponding fault isolation outcome. This work has been supported by VINNOVA’s Competence Center ISIS.