Unsupervised Ensemble Anomaly Detection through Time-Periodical Packet Sampling

We propose an anomaly detection method that trains a baseline model of normal network traffic behavior without using manually labeled traffic data. The trained baseline distribution serves as the reference against which audited network traffic is compared. The method works in an unsupervised manner by using time-periodical packet sampling for a purpose other than the one for which it was originally intended: we exploit the lossy nature of packet sampling to extract normal packets from the unlabeled original traffic data. Using real traffic traces, we show that, for anomalies involving TCP SYN packets, the proposed method achieves false positive and false negative rates comparable to those of the conventional method, which requires manually labeled traffic data to train the baseline model. In addition, to mitigate the possible performance variation caused by the probabilistic nature of sampled traffic data, we devise an ensemble anomaly detection method that exploits multiple baseline models in parallel. Experimental results show that the proposed ensemble anomaly detection performs well and is not affected by the variability of time-periodical packet sampling.
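The intuition behind using the lossy nature of sampling can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `time_periodic_sample`, the rule "pick the first packet at or after each tick", the toy trace, and the chosen period and offsets are all assumptions made for illustration. The sketch shows that a short, dense burst contributes very few packets to a time-periodic sample, and that the sampled content depends on the sampling phase, which is the kind of variability that several baseline models trained in parallel are meant to absorb.

```python
from bisect import bisect_left


def time_periodic_sample(timestamps, period, offset=0):
    """Pick the first packet at or after each tick offset + k * period.

    timestamps must be sorted in ascending order; returns packet indices.
    (Assumed sampling rule for this sketch, not taken from the paper.)
    """
    picked = []
    if not timestamps:
        return picked
    k = 0
    while offset + k * period <= timestamps[-1]:
        i = bisect_left(timestamps, offset + k * period)
        # Skip a packet already chosen by an earlier tick (idle gap).
        if not picked or picked[-1] != i:
            picked.append(i)
        k += 1
    return picked


# Hypothetical toy trace, times in milliseconds: steady background
# traffic plus one dense burst standing in for anomalous packets.
background = [(t, False) for t in range(0, 100_000, 1_000)]  # 100 packets
burst = [(50_200 + j, True) for j in range(500)]             # 500 packets
trace = sorted(background + burst)
times = [t for t, _ in trace]


def anomalous_fraction(indices):
    """Share of the selected packets that belong to the burst."""
    return sum(trace[i][1] for i in indices) / len(indices)


original = anomalous_fraction(range(len(trace)))
# Two sampling phases; in an ensemble, each phase would yield one baseline.
samples = {off: time_periodic_sample(times, 5_000, off) for off in (0, 500)}
fractions = {off: anomalous_fraction(idx) for off, idx in samples.items()}

print(f"original trace: {original:.3f}")
for off, frac in fractions.items():
    print(f"offset {off} ms: {len(samples[off])} packets, {frac:.3f} anomalous")
```

In this toy trace the burst dominates the original packet count, yet each sampled set contains at most one burst packet, and which set contains it depends on the offset.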