A Robust Statistical Estimation of Internet Traffic

This paper develops a new method for estimating flow characteristics in the Internet. For this purpose, a new set of random variables (referred to as observables) is defined. When dealing with sampled traffic, these observables can easily be computed from the sampled data. By adopting a convenient mouse/elephant dichotomy that is itself dependent on the traffic, it is shown how these variables provide robust statistical information on long flows. A mathematical framework is developed to estimate the accuracy of the method. As an application, it is shown how the number of long TCP flows can be estimated when only sampled traffic is available. The proposed algorithm is tested against experimental data collected from different types of IP traffic.
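The abstract does not define the observables, so the following is only a rough sketch of the kind of quantity that can be read directly off sampled data, not the paper's estimator. It assumes probabilistic packet sampling (each packet kept independently with a fixed probability) and counts how many flows appear exactly j times in the sample. The function names, the `rate` parameter, and the representation of packets as a list of flow identifiers are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def sample_packets(packets, rate, seed=0):
    """Keep each packet independently with probability `rate`.

    `packets` is a sequence of flow identifiers, one entry per packet
    (an illustrative representation, not taken from the paper).
    """
    rng = random.Random(seed)
    return [flow_id for flow_id in packets if rng.random() < rate]


def sampled_size_histogram(sampled):
    """Return a mapping j -> number of flows seen exactly j times in the sample."""
    packets_per_flow = Counter(sampled)          # flow id -> sampled packet count
    return Counter(packets_per_flow.values())    # sampled count j -> number of flows


if __name__ == "__main__":
    # Toy trace: one long flow ("elephant") and many short flows ("mice").
    trace = ["elephant"] * 1000 + [f"mouse{i}" for i in range(200) for _ in range(3)]
    sampled = sample_packets(trace, rate=0.01, seed=1)
    print(sampled_size_histogram(sampled))
```

In a sample of this kind, flows that still appear several times are likely to be long flows, which is the intuition behind estimating long-flow statistics from sampled traffic; how the paper turns such counts into an estimate, and with what accuracy, is not stated in the abstract.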
