Statistical methods for computer network traffic analysis

Classical time-series analysis is concerned with data that are stationary and have weak correlations and Gaussian marginals. Computer network traffic, on the other hand, exhibits many complicated and unconventional characteristics, such as self-similarity, long-range dependence, heavy-tailed marginals, and non-stationarity. Accurate detection and estimation of these features are essential for performance evaluation and traffic modelling. However, the presence of two or more of these features can significantly degrade the performance of statistical estimators, yielding poor or incorrect estimates. A critical evaluation of several state-of-the-art statistical methods for detecting and quantifying these properties of network traffic is presented, with the aim of determining when each method is most applicable. Numerous experiments are carried out to gain further insight into the strengths and limitations of each method. It is found that current statistical tools for estimating the tail exponent of a heavy-tailed process with long-range dependence can produce incorrect results. Hence, we propose a simple wavelet-based method that provides a more accurate estimate of the tail exponent than existing methods when long-range dependence is present.
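To make the notion of estimating a tail exponent concrete, the following is a minimal sketch of the Hill estimator, a standard tail-index estimator for independent heavy-tailed data. It is not the wavelet-based method proposed here, and the abstract does not state whether the Hill estimator is among the methods evaluated. The function name `hill_tail_exponent`, the cut-off `k`, and the synthetic Pareto sample are illustrative assumptions.

```python
import math
import random


def hill_tail_exponent(samples, k):
    """Hill estimate of the tail exponent alpha from the k largest samples.

    Assumes independent, positive, heavy-tailed data. Under long-range
    dependence an estimate of this kind may be unreliable.
    """
    if not 1 <= k < len(samples):
        raise ValueError("k must satisfy 1 <= k < len(samples)")
    ordered = sorted(samples, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the (k+1)-th largest sample must be positive")
    # Mean log-excess of the k largest values over the threshold
    # estimates 1/alpha.
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    return 1.0 / mean_log_excess


# Illustrative data: i.i.d. Pareto samples with a known tail exponent.
rng = random.Random(0)
true_alpha = 1.5
data = [rng.paretovariate(true_alpha) for _ in range(20000)]
alpha_hat = hill_tail_exponent(data, k=1000)
print(f"Hill estimate of alpha: {alpha_hat:.3f} (true value {true_alpha})")
```

The choice of `k` trades bias against variance: a small `k` uses only the extreme tail but gives a noisy estimate, while a large `k` includes observations that may not follow the asymptotic power law.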
