An adaptive method for identifying heavy hitters combining sampling and data streaming counting

Identifying heavy hitters is essential for network monitoring, management, and charging, among other tasks. Existing methods in the literature have limitations: reducing memory consumption effectively without compromising identification accuracy remains challenging. This paper proposes FSPLC (feedback sampling probabilistic lossy counting), an adaptive method that combines sampling with data streaming counting. Using the history information in the flow counter table, FSPLC adjusts the sampling frequency dynamically and adapts to real-time traffic changes. Comparison with state-of-the-art algorithms on real Internet traces suggests that FSPLC is remarkably efficient and accurate. Experimental results show that FSPLC has 1) 60% lower memory consumption and 2) a 15% smaller false-positive ratio.
