Anomaly Detection in Partially Observed Traffic Networks

This paper addresses the problem of detecting anomalous activity in traffic networks where the network is not directly observed. Given a baseline describing what the node-to-node traffic in a network should be, any activity that deviates significantly from that baseline is considered anomalous. We propose a Bayesian hierarchical model for estimating the traffic rates and detecting anomalous changes in the network. The probabilistic nature of the model allows us to perform statistical goodness-of-fit tests to detect significant deviations from a baseline network. We show that, because the hierarchical Bayesian model imposes more structure, such tests perform well even when the empirical models estimated by the EM algorithm are misspecified. We apply our model to both simulated and real datasets to demonstrate that it outperforms existing alternatives.
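
To make the idea of testing link-level observations against a baseline concrete, the following is a minimal sketch and not the hierarchical Bayesian model proposed in the paper. It assumes Poisson origin-destination (OD) traffic with known baseline rates, a known 0/1 routing matrix, and that only link counts are observed. It uses a Monte Carlo goodness-of-fit test with the Poisson deviance as the test statistic. The names `A`, `baseline_rates`, `link_deviance`, and `monte_carlo_pvalue`, as well as the toy numbers, are hypothetical and chosen for illustration only.

```python
import numpy as np


def link_deviance(y, mu):
    """Poisson deviance of link counts y against expected link counts mu."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # The term y * log(y / mu) is taken to be 0 where y == 0.
    ratio = np.where(y > 0, y / mu, 1.0)
    return 2.0 * np.sum(y * np.log(ratio) - (y - mu), axis=-1)


def monte_carlo_pvalue(y_obs, A, baseline_rates, n_sim=2000, seed=0):
    """Monte Carlo p-value for observed link counts under the baseline.

    Assumption: OD flows are independent Poisson with the given baseline
    rates, and link counts are y = A x for a known routing matrix A.
    """
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    baseline_rates = np.asarray(baseline_rates, dtype=float)
    mu = A @ baseline_rates  # expected link counts under the baseline

    # Simulate unobserved OD counts, then map them to link counts. This
    # preserves the dependence between links that share an OD flow.
    x_sim = rng.poisson(baseline_rates, size=(n_sim, baseline_rates.size))
    y_sim = x_sim @ A.T

    d_obs = link_deviance(y_obs, mu)
    d_sim = link_deviance(y_sim, mu)
    # Add-one correction so the p-value is never exactly zero.
    return (1 + np.sum(d_sim >= d_obs)) / (n_sim + 1)


# Toy example (hypothetical): 3 OD flows routed over 2 observed links.
A = np.array([[1, 1, 0],
              [0, 1, 1]])
baseline_rates = np.array([10.0, 5.0, 8.0])  # expected link counts: [15, 13]

p_normal = monte_carlo_pvalue(np.array([16, 12]), A, baseline_rates)
p_anomalous = monte_carlo_pvalue(np.array([60, 13]), A, baseline_rates)
print(f"p-value, near baseline: {p_normal:.3f}")
print(f"p-value, inflated link 1: {p_anomalous:.4f}")
```

Simulating the null distribution, rather than using a chi-squared approximation, is a deliberate choice in this sketch: links that share an OD flow have correlated counts, so treating them as independent would be incorrect.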
