Effective Space Saving

In computer networks, it is important to analyze traffic and provide insights about the flows sending packets through the network, e.g., to prevent overloads and DDoS attacks. In this paper, we introduce Effective Space Saving (ESS), a novel algorithm for Top-K identification, a fundamental problem in network monitoring and management. ESS identifies the Top-K flows in the network and answers flow frequency estimation queries while guaranteeing a small error and a small memory footprint. ESS tracks the frequency of only a small portion of the flows, using two tables. Each entry in these tables maps a flow id to its current frequency counter. The Main table stores flows that are suspected of being the heaviest in the stream in terms of frequency. The Window table stores other recently observed flows that are contending to enter the Main table. We use a probabilistic eviction mechanism for the Window table that is based on the collected statistics. These mechanisms improve the overall memory-to-accuracy tradeoff of ESS compared to other known approaches. We demonstrate the effectiveness of ESS on real and synthetic packet traces with varying degrees of skew. Across the different skews, ESS identifies the Top-K flows with a frequency error that is smaller by a factor of between 10^2 and 10^5 compared to Space Saving [19] and by a factor of up to 10 compared to RAP [5], the two state-of-the-art competing algorithms.
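To make the two-table structure concrete, the following is a minimal sketch under stated assumptions, not the ESS algorithm itself. The abstract does not specify the promotion rule or the eviction probability, so both are placeholders here: a Window flow is promoted when its counter exceeds the smallest Main counter, and a full Window replaces its smallest entry with probability 1/(c+1), where c is that entry's counter (a RAP-style randomized admission rule). The class name `TwoTableTopK` and all of its methods are hypothetical.

```python
import random


class TwoTableTopK:
    """Illustrative two-table counter summary (hypothetical, not the paper's ESS)."""

    def __init__(self, main_capacity, window_capacity, seed=None):
        if main_capacity < 1 or window_capacity < 1:
            raise ValueError("both tables need at least one entry")
        self.main_capacity = main_capacity
        self.window_capacity = window_capacity
        self.main = {}    # flow id -> counter, suspected heaviest flows
        self.window = {}  # flow id -> counter, recent contenders
        self.rng = random.Random(seed)

    def update(self, flow_id):
        """Process one packet belonging to flow_id."""
        if flow_id in self.main:
            self.main[flow_id] += 1
            return
        if flow_id in self.window:
            self.window[flow_id] += 1
            self._try_promote(flow_id)
            return
        if len(self.window) < self.window_capacity:
            self.window[flow_id] = 1
            self._try_promote(flow_id)
            return
        # Window is full. ASSUMPTION: evict the smallest Window entry with
        # probability 1/(c+1); the newcomer inherits counter c+1 (RAP-style).
        victim = min(self.window, key=self.window.get)
        victim_count = self.window[victim]
        if self.rng.random() < 1.0 / (victim_count + 1):
            del self.window[victim]
            self.window[flow_id] = victim_count + 1
            self._try_promote(flow_id)

    def _try_promote(self, flow_id):
        """Move a Window flow into Main if there is room or it beats Main's minimum."""
        count = self.window[flow_id]
        if len(self.main) < self.main_capacity:
            del self.window[flow_id]
            self.main[flow_id] = count
            return
        weakest = min(self.main, key=self.main.get)
        if count > self.main[weakest]:
            # ASSUMPTION: swap the two flows, so the demoted flow keeps
            # contending from the Window table.
            del self.window[flow_id]
            self.window[weakest] = self.main.pop(weakest)
            self.main[flow_id] = count

    def estimate(self, flow_id):
        """Return the tracked counter for flow_id, or 0 if it is not tracked."""
        if flow_id in self.main:
            return self.main[flow_id]
        return self.window.get(flow_id, 0)

    def top_k(self, k):
        """Return up to k (flow id, counter) pairs from Main, heaviest first."""
        ranked = sorted(self.main.items(), key=lambda item: item[1], reverse=True)
        return ranked[:k]
```

The linear scans for the minimum entry keep the sketch short; a practical implementation would more likely maintain the minimum with a heap or a stream-summary structure so that updates run in constant or logarithmic time.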

[1]  Divyakant Agrawal,et al.  Efficient Computation of Frequent and Top-k Elements in Data Streams , 2005, ICDT.

[2]  Roy Friedman,et al.  Heavy hitters in streams and sliding windows , 2016, IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications.

[3]  Roy Friedman,et al.  Constant Time Updates in Hierarchical Heavy Hitters , 2017, SIGCOMM.

[4]  Michael Zink,et al.  Characteristics of YouTube network traffic at a campus network - Measurements, models, and implications , 2009, Comput. Networks.

[5]  Jiangchuan Liu,et al.  Statistics and Social Network of YouTube Videos , 2008, 2008 16th International Workshop on Quality of Service.

[6]  George Varghese,et al.  Efficient implementation of a statistics counter architecture , 2003, SIGMETRICS '03.

[7]  Moses Charikar,et al.  Finding frequent items in data streams , 2002, Theor. Comput. Sci.

[8]  Richard M. Karp,et al.  A simple algorithm for finding frequent elements in streams and bags , 2003, TODS.

[9]  Hideto Hidaka,et al.  The cache DRAM architecture: a DRAM with an on-chip cache memory , 1990, IEEE Micro.

[10]  Themis Palpanas,et al.  Frequent items in streaming data: An experimental evaluation of the state-of-the-art , 2009, Data Knowl. Eng.

[11]  Rong Pan,et al.  AF-QCN: Approximate Fairness with Quantized Congestion Notification for Multi-tenanted Data Centers , 2010, 2010 18th IEEE Symposium on High Performance Interconnects.

[12]  Graham Cormode,et al.  An improved data stream summary: the count-min sketch and its applications , 2004, J. Algorithms.

[13]  Gabriel Maciá-Fernández,et al.  Anomaly-based network intrusion detection: Techniques, systems and challenges , 2009, Comput. Secur.

[14]  Devavrat Shah,et al.  Maintaining Statistics Counters in Router Line Cards , 2002, IEEE Micro.

[15]  Marios Hadjieleftheriou,et al.  Methods for finding frequent items in data streams , 2010, The VLDB Journal.

[16]  Roy Friedman,et al.  Randomized admission policy for efficient top-k and frequency estimation , 2016, IEEE INFOCOM 2017 - IEEE Conference on Computer Communications.

[17]  Roy Friedman,et al.  Nitrosketch: robust and general sketch-based monitoring in software switches , 2019, SIGCOMM.