Universal Online Sketch for Tracking Heavy Hitters and Estimating Moments of Data Streams

Traffic measurement is key to many network management tasks such as performance monitoring and cyber-security. Its aim is to inspect the packet stream passing through a network device, classify the packets into flows according to their header fields, and obtain statistics about the flows. To process big streaming data in the size-limited SRAM of line cards, many space-sublinear algorithms have been proposed, such as CountMin and CountSketch. However, most of them are designed for specific measurement tasks, and implementing multiple independent sketches places a burden on the online operations of a network device. It is highly desirable to design a universal sketch that not only tracks individual large flows (called heavy hitters) but also reports overall traffic distribution statistics (called moments). The prior work UnivMon successfully tackled this ambitious quest. However, it incurs large and variable per-packet processing overhead, which may create a significant throughput bottleneck in high-rate packet streams: each packet requires 65 hashes and 64 memory accesses on average, and many times that in the worst case. To address this performance issue, we fundamentally redesign the solution architecture, moving from hierarchical sampling to a new progressive sampling and from CountSketch to a new ActiveCM+. Together these ensure that the per-packet overhead is a small constant (4 hashes and 4 memory accesses) in the worst case, making the design much more suitable for online operations, especially pipeline implementation. The new design also aims to reduce the memory footprint or, equivalently, to improve measurement accuracy under the same memory. Our experiments show that our solution incurs just one sixteenth of UnivMon's per-packet overhead while improving measurement accuracy threefold under the same memory.
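
As background for the per-packet cost discussed above, the following is a minimal sketch of a plain Count-Min structure, not the ActiveCM+ or progressive sampling design of this paper. The class name `CountMin`, the default depth of 4 and width of 1024, and the use of salted BLAKE2b as the per-row hash are all illustrative assumptions. With 4 rows, each update costs 4 hashes and 4 counter accesses, and the estimate never falls below the true flow size.

```python
import hashlib


class CountMin:
    """Plain Count-Min sketch: `depth` rows of `width` counters, one hash per row.

    Illustrative only; parameters and hash choice are assumptions.
    """

    def __init__(self, depth=4, width=1024):
        self.depth = depth
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, flow_id):
        # Salt the hash with the row number to obtain one hash function per row.
        digest = hashlib.blake2b(
            flow_id.encode(),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, flow_id, count=1):
        # One hash and one counter access per row.
        for r in range(self.depth):
            self.rows[r][self._index(r, flow_id)] += count

    def estimate(self, flow_id):
        # Collisions only inflate counters, so the minimum is the tightest bound.
        return min(self.rows[r][self._index(r, flow_id)] for r in range(self.depth))


if __name__ == "__main__":
    sketch = CountMin()
    for flow in ["10.0.0.1"] * 100 + ["10.0.0.2"] * 3:
        sketch.update(flow)
    print(sketch.estimate("10.0.0.1"))  # never less than the true count of 100
```

A flow could then be reported as a heavy hitter when its estimate exceeds a chosen threshold; the threshold and the reporting policy are left open here.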
