APPLICATION OF BAYESIAN REASONING TECHNIQUES TO FAULT LOCALIZATION IN FCS NETWORKS

FCS networks aim to provide a highly automated, secure, and survivable paradigm of battlefield operations. This goal cannot be achieved without the ability to rapidly isolate and correct network faults. Because FCS networks are ad hoc and mobile, a fault management system for them must be accurate and efficient, and must cope with uncertainty, unreliability, and dynamism, all of which are inherent properties of such networks. We propose applying Bayesian reasoning techniques to fault localization in FCS networks and present an event-driven fault localization algorithm that efficiently identifies multiple simultaneous faults. The algorithm produces an accurate fault hypothesis even when information about the system structure is uncertain, and it is resilient to noise in the observed symptoms. We evaluate the algorithm through simulation, assessing its accuracy and performance in identifying the root causes of end-to-end connectivity problems.
