Distributed fault diagnostics for tactical networks

We present the design and evaluation of a distributed fault diagnostic system (FDS) that copes with changing wireless network topology, the complexity and size of fault propagation patterns, constrained bandwidth, and the limited computing power of mobile devices. The FDS consists of several components: a run-time synthesis algorithm that generates a network-wide fault dependency model (FPM), scalable Bayesian inference algorithms, and novel techniques for optimally distributing inference to ensure the scalability of our approach. We describe three algorithms for distributing inference, each using a different technique for maximizing fault-symptom locality: the Fault-based Adaptive algorithm, the Topology-based Adaptive algorithm, and the Topology-based Probabilistic algorithm. We evaluated the performance of the proposed approach in a simulated environment using abstract models of a real-life tactical network, and compared it to a centralized approach. We found that our techniques allow for a significant gain in processing time (a 30-fold improvement for the best-performing technique) and exhibit only a minimal reduction (3 percentage points) in the accuracy of the fault diagnostics.
