Network traffic analysis through statistical signal processing methods

In this thesis, we address three major issues in analyzing network traffic using statistical signal processing methods:

Network traffic control and data planes: We decompose enterprise LAN TCP traffic into control and data planes and use the control plane traffic as a surrogate for the combined traffic when analyzing network traffic. We show that the two traffic groups behave similarly through visual plots and multivariate statistical analysis. Since the control plane traffic has less volume, restricting the analysis to it improves efficiency and scalability. We compare the two traffic groups using the cross-correlation function and show that dissimilarity between them is an indication of abnormal behavior. We also study the Long-Range Dependence (LRD) behavior of the two groups based on the traffic's direction and find that this allows us to focus on smaller segments of the traffic.

Detecting periodic behavior in network traffic: We develop an efficient, robust, multivariate method to detect periodic behavior in network traffic. The method is based on evaluating the periodogram of several count-feature sequences of a traffic trace and testing the significance of the peak of each periodogram.

Botnet command and control (C2) communication channel traffic: In many botnet variants, bots periodically exchange code and updates. We detect bots by detecting the periodic behavior of their C2 traffic. We use SLINGbot to implement two botnet variants, TinyP2P and IRC, and show that the C2 traffic of both exhibits periodic behavior. This holds whether we apply the test to the whole traffic or to the control plane traffic alone. We add background and random noise traffic to the C2 traffic to test the performance of the method. We find that address count sequences are more robust to background traffic: since the number of hosts that a given host communicates with during a certain time window is relatively small, the effect of background traffic on the address count is small. We show that the method's performance improves as the duty cycle and/or the length of the observed traffic increases, and degrades as the period length decreases. Finally, we compare the periodic behavior of C2 traffic to that of e-mail traffic and explain that the two can be easily distinguished because e-mail traffic uses well-known port numbers.
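The periodogram-based detection step can be illustrated with a minimal sketch. The abstract does not say which significance test the thesis applies to the periodogram peak, so the code below assumes Fisher's g-test, a standard test for a single dominant periodogram ordinate under a white Gaussian noise null. The function names, the synthetic count sequence, and all parameter values (bin count, period, burst size, noise level) are hypothetical and chosen only for illustration.

```python
import cmath
import math
import random


def periodogram(counts):
    """Periodogram of the mean-removed sequence at Fourier frequencies
    k/N for k = 1 .. (N-1)//2 (zero and Nyquist frequencies excluded)."""
    n = len(counts)
    mean = sum(counts) / n
    centered = [c - mean for c in counts]
    power = []
    for k in range(1, (n - 1) // 2 + 1):
        dft = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(centered))
        power.append(abs(dft) ** 2 / n)
    return power


def fisher_g_test(power):
    """Assumed test (not confirmed by the abstract): Fisher's g-test.
    Returns (g, p_value, k_peak), where g is the peak ordinate divided by
    the sum of all ordinates and k_peak is the peak's frequency index."""
    m = len(power)
    total = sum(power)
    if m == 0 or total == 0.0:
        return 0.0, 1.0, None  # too short or constant: nothing to test
    peak = max(range(m), key=lambda i: power[i])
    g = power[peak] / total
    # Exact null distribution:
    # P(G > g) = sum_{j=1}^{floor(1/g)} (-1)^(j-1) C(m, j) (1 - j g)^(m-1)
    p = 0.0
    for j in range(1, int(1.0 / g) + 1):
        base = max(0.0, 1.0 - j * g)  # guard against rounding below zero
        p += (-1) ** (j - 1) * math.comb(m, j) * base ** (m - 1)
    return g, min(max(p, 0.0), 1.0), peak + 1


def synthetic_counts(n_bins, period, active_bins, burst, noise_max, seed):
    """Hypothetical count-feature sequence: uniform integer background
    noise plus a burst during the first `active_bins` bins of each period
    (duty cycle = active_bins / period)."""
    rng = random.Random(seed)
    return [rng.randint(0, noise_max) + (burst if t % period < active_bins else 0)
            for t in range(n_bins)]


if __name__ == "__main__":
    counts = synthetic_counts(n_bins=128, period=8, active_bins=2,
                              burst=20, noise_max=5, seed=0)
    g, p_value, k_peak = fisher_g_test(periodogram(counts))
    print("g =", g, "p-value =", p_value,
          "estimated period (bins) =", len(counts) / k_peak)
```

A sequence would be declared periodic when the p-value falls below a chosen significance level, and the estimated period is N divided by the peak's frequency index. In a multivariate setting, the same test could be run on each count-feature sequence (for example packet, byte, and address counts per time window) and the decisions combined. The direct DFT here costs O(N^2) and the alternating sum can lose precision for long sequences, so a practical implementation would likely use an FFT and an approximation of the p-value.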
