Combining filtering and statistical methods for anomaly detection

In this work we develop an approach to anomaly detection for large-scale networks such as those of an enterprise or an ISP. Our analysis focuses on the traffic matrix, a network-wide view of the traffic state. In the first step, a Kalman filter is used to filter out the "normal" traffic. This is done by comparing our predictions of the traffic matrix state to an inference of the actual traffic matrix that is made using more recent measurement data than those used for the prediction. In the second step, the residual filtered process is examined for anomalies. We explain how any anomaly detection method can be viewed as a problem in statistical hypothesis testing. We study and compare four different methods for analyzing residuals, two of which are new. These methods focus on different aspects of the traffic pattern change: one on instantaneous behavior, another on changes in the mean of the residual process, a third on changes in the variance, and a fourth on variance changes over multiple timescales. We evaluate and compare all of these methods using ROC curves, which illustrate the full tradeoff between false positives and false negatives across the complete spectrum of decision thresholds.
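The two-step idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single traffic matrix entry that is observed directly (the setting described above infers the whole matrix from measurements), a random-walk state model, and made-up noise variances `q` and `r`. The residual is the difference between the estimate updated with the newest measurement and the earlier prediction, and the detector is a simple threshold on its instantaneous magnitude. The function names, the synthetic spike size, and the threshold grid are all hypothetical.

```python
import random


def kalman_residuals(observations, q=1.0, r=4.0, x0=0.0, p0=1.0):
    """Scalar random-walk Kalman filter (assumed model, for illustration).

    Returns, for each step, the updated estimate minus the prior prediction.
    """
    x, p = x0, p0
    residuals = []
    for y in observations:
        # Predict: x_t = x_{t-1} + w, with Var(w) = q (assumed).
        x_prior = x
        p_prior = p + q
        # Update with the newer measurement y_t = x_t + v, Var(v) = r (assumed).
        gain = p_prior / (p_prior + r)
        x = x_prior + gain * (y - x_prior)
        p = (1.0 - gain) * p_prior
        # Residual: estimate using recent data minus the earlier prediction.
        residuals.append(x - x_prior)
    return residuals


def roc_points(scores, labels, thresholds):
    """(false positive rate, true positive rate) per threshold on |score|."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for th in thresholds:
        tp = sum(1 for s, l in zip(scores, labels) if l and abs(s) > th)
        fp = sum(1 for s, l in zip(scores, labels) if not l and abs(s) > th)
        points.append((fp / neg, tp / pos))
    return points


def demo(seed=0, n=500):
    """Synthetic traffic with a spike injected every 50 steps (hypothetical)."""
    rng = random.Random(seed)
    level, obs, labels = 100.0, [], []
    for t in range(n):
        level += rng.gauss(0.0, 1.0)          # slowly drifting normal traffic
        is_anomaly = (t % 50 == 25)
        spike = 30.0 if is_anomaly else 0.0   # injected anomaly volume
        obs.append(level + spike + rng.gauss(0.0, 2.0))
        labels.append(is_anomaly)
    res = kalman_residuals(obs, q=1.0, r=4.0, x0=obs[0])
    thresholds = [0.5 * k for k in range(41)]
    return res, labels, roc_points(res, labels, thresholds)


if __name__ == "__main__":
    _, _, points = demo()
    for fpr, tpr in points[::5]:
        print(f"FPR={fpr:.3f}  TPR={tpr:.3f}")
```

Sweeping the threshold traces one ROC curve for this instantaneous test. A detector based on changes in the residual mean or variance would replace `abs(s)` with its own statistic and be compared on the same axes.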
