An empirical evaluation of entropy-based traffic anomaly detection

Entropy-based approaches to anomaly detection are appealing because they provide more fine-grained insights than traditional traffic volume analysis. While previous work has demonstrated the benefits of entropy-based anomaly detection, there has been little effort to comprehensively understand the detection power of entropy-based analysis when multiple traffic distributions are used in conjunction with each other. We consider two classes of distributions: flow-header features (IP addresses, ports, and flow sizes) and behavioral features (degree distributions measuring the number of distinct destination/source IPs that each host communicates with). We observe that the time series of entropy values of the address and port distributions are strongly correlated with each other and provide very similar anomaly detection capabilities. The behavioral and flow size distributions are less correlated and detect incidents that do not show up as anomalies in the port and address distributions. Further analysis using synthetically generated anomalies also suggests that the port and address distributions have limited utility in detecting scan and bandwidth flood anomalies. Based on our analysis, we discuss important implications for entropy-based anomaly detection.
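
To make the two classes of distributions concrete, the following minimal Python sketch computes an entropy value for one flow-header feature (source address) and one behavioral feature (out-degree) over a single time bin. It is an illustration under stated assumptions, not the evaluation code behind the results above: the flow records are invented, the helper names `normalized_entropy` and `out_degree_histogram` are hypothetical, and scaling the entropy by the log of the number of distinct values is one common normalization that the abstract does not specify.

```python
import math
from collections import Counter, defaultdict


def normalized_entropy(counts):
    """Shannon entropy of a histogram, scaled to [0, 1].

    Assumption: normalize by log2 of the number of distinct values observed.
    """
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return h / math.log2(len(counts))


def out_degree_histogram(flows):
    """Count how many hosts contact exactly k distinct destination IPs."""
    peers = defaultdict(set)
    for src, dst, _dport in flows:
        peers[src].add(dst)
    return Counter(len(dsts) for dsts in peers.values())


# Hypothetical flow records for one time bin: (src_ip, dst_ip, dst_port).
baseline = [
    ("10.0.0.1", "10.0.1.1", 80),
    ("10.0.0.2", "10.0.1.1", 80),
    ("10.0.0.3", "10.0.1.2", 443),
    ("10.0.0.4", "10.0.1.2", 443),
]
# The same bin plus one host probing 50 destinations on a single port.
with_scan = baseline + [("10.0.0.9", f"10.0.2.{i}", 22) for i in range(50)]

results = {}
for name, batch in (("baseline", baseline), ("with scan", with_scan)):
    src_h = normalized_entropy(Counter(src for src, _dst, _port in batch))
    deg_h = normalized_entropy(out_degree_histogram(batch))
    results[name] = (src_h, deg_h)
    print(f"{name}: srcIP entropy={src_h:.3f}, out-degree entropy={deg_h:.3f}")
```

In practice each value would be computed per time bin and tracked as a time series, one per distribution, so that deviations in any of them can be flagged. The toy data only shows how the two kinds of histograms are built; it says nothing about which distribution detects scans better on real traffic.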
