Mouse Trapping: A Flow Data Reduction Method

Flow-based traffic measurement is an important tool for network management, but it produces huge amounts of data and scales poorly. It is therefore important to find methods that reduce this data volume and improve scalability, for example for long-term archiving or for filtering in mediators. Two facts help here: general Internet traffic has power-law characteristics, and for many applications it is sufficient to look only at the large flows. In this work we introduce Mouse Trapping, a flow data reduction method that keeps the large flow records while aggregating or even removing the small flow records. We show in a theoretical simulation that, because of the heavy-tailed nature of normal flow data, most of the traffic is represented by only a few large flow records, while the small flow records represent only a small part of the traffic. An evaluation with real traffic data confirms that the traffic flows are in fact mostly power-law distributed. With this method, flow data can be reduced by up to 90% if all small flow records are simply discarded, which affects flow records accounting for only 5% of the traffic.
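
The idea of keeping large flow records and aggregating or discarding small ones can be illustrated with a minimal sketch. This is not the implementation evaluated in this work: the record layout (`FlowRecord`), the function name `mouse_trap`, the byte threshold, and the choice to fold all small records into a single wildcard summary record are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FlowRecord:
    # Hypothetical, simplified flow record; real records carry more fields.
    src: str
    dst: str
    packets: int
    octets: int


def mouse_trap(records: List[FlowRecord], threshold_octets: int,
               aggregate: bool = True) -> List[FlowRecord]:
    """Keep records with at least threshold_octets; aggregate or drop the rest.

    The threshold and the single-summary aggregation are illustrative
    assumptions, not values or rules taken from the paper.
    """
    large = [r for r in records if r.octets >= threshold_octets]
    small = [r for r in records if r.octets < threshold_octets]
    if aggregate and small:
        # Fold all small records ("mice") into one summary record so that
        # total packet and octet counts are preserved.
        summary = FlowRecord(
            src="*",
            dst="*",
            packets=sum(r.packets for r in small),
            octets=sum(r.octets for r in small),
        )
        return large + [summary]
    return large


# Synthetic example data: two large flows and eight small ones.
records = [
    FlowRecord("10.0.0.1", "10.0.0.2", 1000, 1_000_000),
    FlowRecord("10.0.0.3", "10.0.0.4", 500, 500_000),
] + [FlowRecord(f"10.0.1.{i}", "10.0.0.9", 2, 1_000) for i in range(8)]

discarded = mouse_trap(records, threshold_octets=10_000, aggregate=False)
aggregated = mouse_trap(records, threshold_octets=10_000, aggregate=True)
```

In this synthetic input, discarding the small records shrinks the record count from ten to two while dropping well under 1% of the octets; aggregating them instead keeps the totals exact at the cost of one extra record. These figures describe the toy data only and say nothing about real traffic.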
