An empirical study on TCP flow interarrival time distribution for normal and anomalous traffic

SUMMARY In this paper, we study the effects of anomalies on the distribution of the TCP flow interarrival time process. We show empirically that, despite the variety of data networks in size, number of users, applications, and load, the interarrival times of normal flows follow the Weibull distribution, whereas specific irregularities (anomalies) cause deviations from it. We first estimate the scale and shape parameters and then measure the discrepancy between the data and a Weibull distribution with the estimated parameters. We also use the Weibull counting model to recheck the conformance of small flow interarrival times with the distribution. We perform our experiments on a diverse collection of traffic data sets, ranging from backbone connections to endpoints of academic and commercial networks. Moreover, as a possible application of our findings, we propose a window-based anomaly detection method: in each window, we estimate the Weibull parameters of the interarrival times, measure the discrepancy between the data and a Weibull distribution with those parameters, and raise an alarm whenever the difference is significant. We apply this method to one of our data sets and present the results to clarify the idea and show its capability in detecting volume anomalies. Copyright © 2014 John Wiley & Sons, Ltd.
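
The window-based procedure described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices here are assumptions made only for illustration: the parameters are fitted by maximum likelihood, the discrepancy is measured with the Kolmogorov-Smirnov distance as a stand-in for whatever measure the paper uses, windows hold a fixed number of interarrival samples, and the names `fit_weibull`, `ks_distance`, `window_alarms`, and the threshold value are hypothetical. The Weibull CDF used is F(x) = 1 - exp(-(x/scale)^shape).

```python
import math
import random


def fit_weibull(samples):
    """Maximum-likelihood (shape, scale) for a list with at least one positive sample."""
    xs = [x for x in samples if x > 0]
    top = max(xs)
    zs = [x / top for x in xs]  # rescale to (0, 1] so z ** k cannot overflow
    logs = [math.log(z) for z in zs]
    mean_log = sum(logs) / len(zs)

    def score(k):
        # Shape equation of the Weibull MLE; it is increasing in k and
        # crosses zero at the estimate.
        pw = [z ** k for z in zs]
        return sum(p * l for p, l in zip(pw, logs)) / sum(pw) - 1.0 / k - mean_log

    lo, hi = 1e-3, 1.0
    while score(hi) < 0 and hi < 1e3:  # widen the bracket until the sign changes
        hi *= 2.0
    for _ in range(100):  # plain bisection on the bracket [lo, hi]
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    shape = 0.5 * (lo + hi)
    scale = top * (sum(z ** shape for z in zs) / len(zs)) ** (1.0 / shape)
    return shape, scale


def ks_distance(samples, shape, scale):
    """Largest gap between the empirical CDF and the fitted Weibull CDF."""
    xs = sorted(x for x in samples if x > 0)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - math.exp(-((x / scale) ** shape))
        worst = max(worst, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return worst


def window_alarms(interarrivals, window_size, threshold):
    """Return indices of full windows whose data deviate from their own Weibull fit."""
    alarms = []
    for w in range(len(interarrivals) // window_size):
        chunk = interarrivals[w * window_size:(w + 1) * window_size]
        shape, scale = fit_weibull(chunk)
        if ks_distance(chunk, shape, scale) > threshold:
            alarms.append(w)
    return alarms


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic stand-in for normal traffic: Weibull interarrival times (seconds).
    normal = [rng.weibullvariate(0.05, 0.7) for _ in range(2000)]
    # Synthetic volume anomaly: half the flows arrive at a fixed 1 ms spacing.
    flood = [rng.weibullvariate(0.05, 0.7) for _ in range(500)] + [0.001] * 500
    print(window_alarms(normal + flood, window_size=1000, threshold=0.1))
```

The synthetic data, the window size of 1000 samples, and the threshold of 0.1 are arbitrary illustration values; in practice the threshold would have to be calibrated on traffic known to be normal.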
