Probabilistic fading counting: An efficient identification method for finding heavy hitters

Identifying heavy-hitter flows in a network is important for many network management activities, and the problem of finding these flows has been the subject of many studies in the past few years. Lossy counting [12] and probabilistic lossy counting [11] are among the best-known algorithms for finding heavy hitters, but these methods have limitations. The challenge is to reduce memory consumption effectively while achieving better accuracy. In this work, we introduce a probabilistic fading method combined with data streaming counting, which we call probabilistic fading lossy counting (PFC). This method retains the advantages of data streaming counting and finds heavy hitters by exploiting the power-law characteristic of network flows. Comparisons with lossy counting and probabilistic lossy counting based on real Internet traces suggest that PFC is both more memory-efficient and more accurate. In particular, experimental results show that PFC consumes 60% less memory without increasing either the false negative ratio or the false positive ratio.
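For context, the baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of classic lossy counting only, not of PFC, whose fading rule is not described in the abstract. The class name `LossyCounter`, its method names, and the parameters `epsilon` and `support` are assumptions chosen for illustration.

```python
import math


class LossyCounter:
    """Illustrative sketch of classic lossy counting (not PFC)."""

    def __init__(self, epsilon):
        if not 0 < epsilon < 1:
            raise ValueError("epsilon must be in (0, 1)")
        self.epsilon = epsilon
        # Each bucket (window) holds ceil(1/epsilon) stream items.
        self.width = math.ceil(1 / epsilon)
        self.n = 0  # number of items seen so far
        # key -> [count, delta]; delta bounds the count missed before insertion.
        self.entries = {}

    def add(self, key):
        self.n += 1
        bucket = math.ceil(self.n / self.width)  # id of the current bucket
        entry = self.entries.get(key)
        if entry is not None:
            entry[0] += 1
        else:
            self.entries[key] = [1, bucket - 1]
        # At each bucket boundary, drop entries that cannot be frequent.
        if self.n % self.width == 0:
            self.entries = {
                k: e for k, e in self.entries.items() if e[0] + e[1] > bucket
            }

    def heavy_hitters(self, support):
        """Return keys whose tracked count is at least (support - epsilon) * n."""
        threshold = (support - self.epsilon) * self.n
        return {k: e[0] for k, e in self.entries.items() if e[0] >= threshold}


if __name__ == "__main__":
    counter = LossyCounter(epsilon=0.1)
    for flow_id in ["a", "a", "b", "a", "c", "a", "b", "a", "d", "a"]:
        counter.add(flow_id)
    print(counter.heavy_hitters(support=0.5))
```

The memory cost of this baseline is the size of the `entries` table, which is the quantity the abstract reports PFC as reducing.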

[1] Thomas Bonald, et al. Statistical bandwidth sharing: a study of congestion at flow level, 2001, SIGCOMM.

[2] Rajeev Motwani, et al. Approximate Frequency Counts over Data Streams, 2012, VLDB.

[3] Konstantina Papagiannaki, et al. A pragmatic definition of elephants in internet backbone traffic, 2002, IMW '02.

[4] Xenofontas A. Dimitropoulos, et al. Probabilistic lossy counting: an efficient algorithm for finding heavy hitters, 2008, CCRV.

[5] Chita R. Das, et al. Design of a Dynamic Priority-Based Fast Path Architecture for On-Chip Interconnects, 2007.

[6] Bill Lin, et al. A Simple Mechanism for Throttling High-Bandwidth Flows, 2008, J. Electr. Comput. Eng.

[7] George Varghese, et al. Building a better NetFlow, 2004, SIGCOMM 2004.

[8] Yi Lu, et al. ElephantTrap: A low cost device for identifying large flows, 2007, 15th Annual IEEE Symposium on High-Performance Interconnects (HOTI 2007).

[9] John Heidemann, et al. On the correlation of Internet flow characteristics, 2003.

[10] Richard M. Karp, et al. A simple algorithm for finding frequent elements in streams and bags, 2003, TODS.

[11] Cristian Estan, et al. New directions in traffic measurement and accounting, 2001, IMW '01.

[12] Scott Shenker, et al. On the characteristics and origins of internet flow rates, 2002, SIGCOMM.

[13] R. Wilder, et al. Wide-area Internet traffic patterns and characteristics, 1997, IEEE Netw.

[14] Shigeki Goto, et al. Identifying elephant flows through periodically sampled packets, 2004, IMC '04.

[15] Shigeki Goto, et al. On the characteristics of Internet traffic variability: spikes and elephants, 2004, 2004 International Symposium on Applications and the Internet. Proceedings.

[16] Xie Gaogang, et al. An Improved Adaptive Sampling Method for Traffic Measurement, 2007.

[17] Yi Lu, et al. ElephantTrap: A low cost device for identifying large flows, 2007.