Jellyfish: Locality-Sensitive Subflow Sketching

To cope with increasing network rates and massive traffic volumes, sketch-based methods have been extensively studied; they trade accuracy for memory scalability and lower storage cost. However, sketches are sensitive to hash collisions caused by skewed keys in real-world environments, and they need complicated performance control to keep up with line-rate packet streams. We present Jellyfish, a locality-sensitive sketching framework that addresses these issues. Jellyfish goes beyond network-flow-based sketching and operates on fragments of network flows called subflows. First, Jellyfish splits consecutive packets from each network flow into subflow records, which not only reduces rate contention but also provides intermediate subflow representations in the form of truncated counters. Next, Jellyfish maps similar subflow records to the same bucket array and merges those from the same network flow to reconstruct the network-flow-level counters. Real-world trace-driven experiments show that Jellyfish reduces the average estimation error by up to six orders of magnitude for per-flow queries, by six orders of magnitude for entropy queries, and by up to ten times for heavy-hitter queries.
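The two steps described above can be illustrated with a toy sketch. This is not the Jellyfish implementation; the abstract does not specify how subflows are cut, what "similar" means, or how buckets are laid out. The code below assumes that a subflow is a run of consecutive packets with the same flow key, capped at a hypothetical `max_count`; that similarity is judged by counter magnitude (a stand-in for the locality-sensitive mapping); and that each bucket keeps the flow key, so merging is exact. All function and parameter names are illustrative.

```python
import zlib


def split_into_subflows(packets, max_count=4):
    """Yield (flow_key, count) records for runs of consecutive packets.

    `packets` is an iterable of flow keys, one per packet. A record is
    closed when the flow key changes or the counter reaches `max_count`
    (an assumed truncation threshold).
    """
    current_key, count = None, 0
    for key in packets:
        if key != current_key or count == max_count:
            if current_key is not None:
                yield current_key, count
            current_key, count = key, 0
        count += 1
    if current_key is not None:
        yield current_key, count


def bucket_array_index(count, num_arrays):
    """Assumed similarity rule: records of similar magnitude share an array."""
    return min(count.bit_length() - 1, num_arrays - 1)


def build_sketch(records, num_arrays=3, width=1024):
    """Place each record in a bucket array and merge records of the same flow."""
    arrays = [[{} for _ in range(width)] for _ in range(num_arrays)]
    for key, count in records:
        array = arrays[bucket_array_index(count, num_arrays)]
        bucket = array[zlib.crc32(key.encode()) % width]
        # Merge subflow records that belong to the same network flow.
        bucket[key] = bucket.get(key, 0) + count
    return arrays


def query(arrays, key):
    """Reconstruct the flow-level counter by summing over all bucket arrays."""
    slot = zlib.crc32(key.encode())
    return sum(array[slot % len(array)].get(key, 0) for array in arrays)


packets = ["a"] * 10 + ["b"] * 2 + ["a"] * 3 + ["c"]
records = list(split_into_subflows(packets, max_count=4))
sketch = build_sketch(records)
```

In this example the ten leading packets of flow `a` become three subflow records, and `query(sketch, "a")` sums them together with the later run of three packets to recover the flow-level count. A memory-constrained sketch would not store keys in every bucket, so its estimates would be approximate rather than exact.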
