Distributed Fault Diagnosis Using Bayesian Reasoning in MAGNETO

Many emerging telecom services rely on Outer Edge Networks, in particular Home Area Networks. The configuration and maintenance of such services may not be under the full control of the telecom operator, which must nevertheless guarantee the service quality experienced by the consumer. Diagnosing service faults in these scenarios is especially difficult because there may not be full visibility between the different domains. This paper describes the fault diagnosis solution developed in the MAGNETO project, which applies Bayesian inference to deal with this uncertainty. The solution also uses a distributed framework to deploy diagnosis components across the domains and network elements involved, spanning both the telecom operator and the outer edge networks. In addition, MAGNETO features self-learning capabilities that automatically improve diagnosis knowledge over time, and a partition mechanism that breaks the overall diagnosis knowledge down into smaller subsets. The MAGNETO solution has been prototyped, adapted to a particular outer edge scenario, and validated on a real testbed. Evaluation of the results shows the potential of our approach for fault management of outer edge networks.
