Precise error estimation for sketch-based flow measurement

As a class of approximate measurement approaches, sketching algorithms have significantly improved the estimation of network flow information under limited resources. Although these algorithms come with sound worst-case error bounds, their actual errors vary significantly with the incoming flow distribution, which makes the traditional bounds too "loose" to be useful in practice. In this paper, we propose a simple yet rigorous error estimation method that analyzes the errors of posterior sketch queries more precisely by leveraging the information in the sketch counters. This approach enables network operators to understand how accurate the current measurements are and to make decisions accordingly (e.g., identify potential heavy users or answer "what-if" questions to better provision resources). Theoretical analysis and trace-driven experiments show that our estimated bounds on sketch errors are much tighter than previous ones and match the actual error bounds in most cases.
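To illustrate the gap between a worst-case bound and the errors actually observed, the following is a minimal sketch, not the estimation method proposed in the paper. It assumes a standard Count-Min sketch with the classical guarantee that a query overestimates by at most (e / width) * N with probability at least 1 - exp(-depth), and it feeds the sketch a synthetic Zipf-like stream. The names `CountMinSketch`, `zipf_stream`, and `compare_bound_to_observed`, as well as all parameter values, are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


class CountMinSketch:
    """Plain Count-Min sketch over integer keys (illustrative only)."""

    PRIME = (1 << 61) - 1  # Mersenne prime used for the hash family

    def __init__(self, width, depth, seed=0):
        self.width = width
        self.depth = depth
        rng = random.Random(seed)
        # One (a, b) pair per row for the hash ((a * key + b) mod p) mod width.
        self.coeffs = [
            (rng.randrange(1, self.PRIME), rng.randrange(0, self.PRIME))
            for _ in range(depth)
        ]
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        a, b = self.coeffs[row]
        return ((a * key + b) % self.PRIME) % self.width

    def update(self, key, count=1):
        for r in range(self.depth):
            self.rows[r][self._index(r, key)] += count

    def query(self, key):
        # The minimum over rows never underestimates the true count.
        return min(self.rows[r][self._index(r, key)] for r in range(self.depth))


def zipf_stream(n_flows, n_packets, skew, seed=1):
    """Draw flow IDs 0..n_flows-1 with probability proportional to rank**-skew."""
    rng = random.Random(seed)
    weights = [1.0 / (rank ** skew) for rank in range(1, n_flows + 1)]
    return rng.choices(range(n_flows), weights=weights, k=n_packets)


def compare_bound_to_observed(width=272, depth=3, n_flows=5000,
                              n_packets=100_000, skew=1.2):
    """Return (worst-case bound, max error, mean error, min error)."""
    stream = zipf_stream(n_flows, n_packets, skew)
    truth = Counter(stream)

    sketch = CountMinSketch(width, depth)
    for key in stream:
        sketch.update(key)

    # Per-flow overestimation for every flow that appeared in the stream.
    errors = [sketch.query(key) - count for key, count in truth.items()]

    # Classical Count-Min bound: epsilon * N with epsilon = e / width.
    worst_case_bound = math.e * n_packets / width
    return worst_case_bound, max(errors), sum(errors) / len(errors), min(errors)


if __name__ == "__main__":
    bound, max_err, mean_err, min_err = compare_bound_to_observed()
    print(f"worst-case bound: {bound:.1f}")
    print(f"observed errors : max={max_err}, mean={mean_err:.1f}, min={min_err}")
```

Running the script prints the distribution-independent bound next to the errors measured on this particular stream. Changing `skew` shows how the observed errors move with the flow distribution while the bound stays fixed, which is the motivation for estimating errors from the sketch counters themselves.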
