Distribution-based anomaly detection via generalized likelihood ratio test: A general Maximum Entropy approach

We address the problem of detecting "anomalies" in the network traffic produced by a large population of end-users, following a distribution-based change detection approach. In the considered scenario, different traffic variables are monitored at different levels of temporal aggregation (timescales), resulting in a grid of variable/timescale nodes. For every node, a set of per-user traffic counters is maintained and then summarized into a histogram for every time bin, yielding a timeseries of empirical (discrete) distributions for every variable/timescale node. Within this framework, we tackle the problem of designing a formal Distribution-based Change Detector (DCD) able to identify statistically significant deviations from the past behavior of each individual timeseries.

For the detection task we propose a novel methodology based on a Maximum Entropy (ME) modeling approach. Each empirical distribution (sample observation) is mapped to a set of ME model parameters, called the "characteristic vector", via closed-form Maximum Likelihood (ML) estimation. This allows us to derive a detection rule based on a formal hypothesis test (Generalized Likelihood Ratio Test, GLRT) that measures the coherence of the current observation, i.e., its characteristic vector, with the given reference. The reference is identified dynamically, taking into account the typical non-stationarity displayed by real network traffic. Numerical results on synthetic data demonstrate the robustness of our detector, while the evaluation on a labeled dataset from an operational 3G cellular network confirms the capability of the proposed method to identify real traffic anomalies.
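The pipeline described above (histogram, ML-estimated parameters, GLRT against a reference) can be illustrated with a minimal sketch. It is not the detector proposed in the paper: the abstract does not specify the ME model family, so the sketch assumes the simplest case, a categorical model with one parameter per histogram bin, for which the ML estimate is the vector of relative frequencies. The function names, the pooled-reference formulation, and the `threshold` parameter are all hypothetical, and the dynamic reference selection is omitted.

```python
import math


def ml_characteristic_vector(hist):
    """Closed-form ML estimate under the ASSUMED categorical model:
    the relative frequency of each histogram bin."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must contain at least one count")
    return [count / total for count in hist]


def log_likelihood(hist, probs):
    """Log-likelihood of the bin counts under the given bin probabilities.
    Empty bins contribute nothing (0 * log p is taken as 0)."""
    return sum(c * math.log(p) for c, p in zip(hist, probs) if c > 0)


def glrt_statistic(current, reference):
    """GLRT statistic for H0 'current and reference share one distribution'
    against H1 'each has its own distribution'.

    Under H0 the ML estimate is computed from the pooled counts; under H1
    it is computed separately for each histogram. The result is
    2 * (max log-likelihood under H1 - max log-likelihood under H0).
    """
    if len(current) != len(reference):
        raise ValueError("histograms must use the same bins")
    pooled = [c + r for c, r in zip(current, reference)]
    p_cur = ml_characteristic_vector(current)
    p_ref = ml_characteristic_vector(reference)
    p_pool = ml_characteristic_vector(pooled)
    ll_h1 = log_likelihood(current, p_cur) + log_likelihood(reference, p_ref)
    ll_h0 = log_likelihood(current, p_pool) + log_likelihood(reference, p_pool)
    return 2.0 * (ll_h1 - ll_h0)


def is_anomalous(current, reference, threshold):
    """Flag the current time bin when the statistic exceeds a threshold.
    The threshold is a hypothetical tuning parameter of this sketch."""
    return glrt_statistic(current, reference) > threshold


if __name__ == "__main__":
    reference = [500, 300, 200]   # aggregated counts from past time bins
    typical = [50, 30, 20]        # same shape as the reference
    shifted = [20, 30, 50]        # mass moved toward the last bin
    print(glrt_statistic(typical, reference))
    print(glrt_statistic(shifted, reference))
```

In this sketch a current histogram with the same shape as the reference gives a statistic near zero, while a histogram whose mass has moved between bins gives a large one. Under the assumed model the statistic is asymptotically chi-square distributed with (number of bins minus one) degrees of freedom when both histograms come from the same distribution, which offers one way to pick the threshold for a target false-alarm rate.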
