Network failure detection based on correlation data analysis

Abstract: For a telecommunication operator, effective detection of failures in the access and aggregation networks is key to providing continuous service. Although many monitoring systems are available on the market, analysis has shown that standard monitoring systems cannot automatically detect all failures. This article proposes a new approach to failure detection based on correlation analysis of data retrieved from the network in real time. A key data source is the Remote Authentication Dial-In User Service (RADIUS), which records events describing the state of each user’s Point-to-Point Protocol (PPP) session; these events are sent to the monitoring system. The article shows that the proposed solution enables an operator to detect all network events affecting customers. Detecting more events lets the operator react quickly and restore service to users sooner, which ultimately improves the quality and continuity of the services provided.
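
The abstract does not spell out the correlation procedure, so the following is only an illustrative sketch of one way such an analysis could work, not the authors' method. It assumes that each RADIUS-derived session event carries a timestamp, a user, and an identifier of the access node serving that user, and it raises an alarm when several distinct users on the same node lose their PPP sessions within a short time window. The names `SessionEvent`, `detect_suspected_failures`, `window_s`, and `min_stops`, as well as the threshold values, are hypothetical.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionEvent:
    """One PPP session state change (hypothetical record layout)."""
    timestamp: float  # seconds since an arbitrary epoch
    node_id: str      # assumed identifier of the serving access node
    user: str
    kind: str         # "start" or "stop"


def detect_suspected_failures(events, window_s=60.0, min_stops=3):
    """Flag nodes where many distinct users drop within one time window.

    Returns a list of (node_id, timestamp, distinct_users) tuples, one per
    burst of correlated session stops. Thresholds are illustrative only.
    """
    recent = defaultdict(deque)  # node_id -> (timestamp, user) of recent stops
    alarmed = set()              # nodes currently above the threshold
    alarms = []

    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.kind != "stop":
            continue
        stops = recent[ev.node_id]
        stops.append((ev.timestamp, ev.user))
        # Discard stops that have fallen out of the sliding window.
        while stops and ev.timestamp - stops[0][0] > window_s:
            stops.popleft()
        distinct_users = {user for _, user in stops}
        if len(distinct_users) >= min_stops:
            # Report each burst once, when the threshold is first crossed.
            if ev.node_id not in alarmed:
                alarmed.add(ev.node_id)
                alarms.append((ev.node_id, ev.timestamp, len(distinct_users)))
        else:
            alarmed.discard(ev.node_id)
    return alarms
```

In this sketch an isolated session stop is treated as normal user behaviour, while simultaneous stops that share a network element are treated as a likely fault on that element. A real deployment would need topology data to map users to nodes and thresholds tuned to each node's subscriber count.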
