Topology-Aware Correlated Network Anomaly Event Detection and Diagnosis

Interoperable measurement frameworks such as perfSONAR are increasingly deployed across multi-domain networks for end-to-end monitoring, capacity planning, and performance bottleneck troubleshooting. These deployments expose vast archives of current and historic measurements that can be queried using web services. Analyzing these measurements with effective schemes to detect and diagnose anomaly events is vital, since it allows operators to verify whether network behavior meets expectations. It also enables proactive notification of bottlenecks that may be affecting a large number of users. In this paper, we describe our novel topology-aware scheme, which can be integrated into perfSONAR deployments to detect and diagnose network-wide correlated anomaly events. Our scheme performs spatial and temporal analyses on combined topology and uncorrelated anomaly event information to detect correlated anomaly events. Subsequently, a set of "filters" is applied to the detected events to prioritize them by potential severity and to drill down into each event's "nature" (e.g., event burstiness) and "root-location(s)" (e.g., edge or core location affinity). To validate our scheme, we use traceroute information and one-way delay measurements collected over 3 months between the various U.S. Department of Energy national lab network locations and published via perfSONAR web services. Further, using real-world case studies, we show how our scheme can provide helpful insights for the detection, visualization, and diagnosis of correlated network anomaly events, and can ultimately save the time, effort, and cost spent on network management.
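
The abstract does not give the algorithm itself, so the following Python sketch is only an illustration of the general idea of combining topology with per-path (uncorrelated) anomaly events. It assumes that each monitored path is a list of traceroute hops and that each anomaly event is a (path, timestamp) pair. The function name `correlate_events`, the fixed time window, the `min_paths` threshold, and the sample hop names are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def correlate_events(paths, events, window_s=300, min_paths=2):
    """Illustrative only, not the paper's method.

    paths:  {path_id: [hop, ...]} from traceroute (assumed format)
    events: [(path_id, unix_ts), ...] per-path anomaly events
    Returns (window_start, hop, [path_ids]) for every hop shared by at
    least `min_paths` anomalous paths within the same time window.
    """
    # Temporal step: bucket per-path anomaly events into fixed windows.
    buckets = defaultdict(set)
    for path_id, ts in events:
        buckets[ts // window_s].add(path_id)

    correlated = []
    for bucket, anomalous in sorted(buckets.items()):
        # Spatial step: record which anomalous paths traverse each hop.
        hop_paths = defaultdict(set)
        for path_id in anomalous:
            for hop in paths[path_id]:
                hop_paths[hop].add(path_id)
        for hop, pids in sorted(hop_paths.items()):
            if len(pids) >= min_paths:
                correlated.append((bucket * window_s, hop, sorted(pids)))
    return correlated

# Hypothetical topology: three paths, two of which share an edge and a core hop.
paths = {
    "A-B": ["a-edge", "core1", "core2", "b-edge"],
    "A-C": ["a-edge", "core1", "c-edge"],
    "D-C": ["d-edge", "core3", "c-edge"],
}
events = [("A-B", 1000), ("A-C", 1100), ("D-C", 5000)]
result = correlate_events(paths, events)
for window_start, hop, pids in result:
    print(window_start, hop, pids)
```

In this toy input, the anomalies on paths A-B and A-C fall in the same window and share the hops `a-edge` and `core1`, so both hops are reported as candidate root locations, while the isolated later event on D-C is not. Fixed windows are a simplification: two events that straddle a window boundary would be missed, which a sliding window would avoid.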
