Randomized admission policy for efficient top-k and frequency estimation

Network management protocols often require timely and meaningful insight into per-flow network traffic. This paper introduces Randomized Admission Policy (RAP), a novel algorithm for the frequency and top-k estimation problems, which are fundamental in network monitoring. We demonstrate space reductions compared to the alternatives by a factor of up to 32 on real packet traces and up to 128 on heavy-tailed workloads. For top-k identification, RAP achieves memory savings by a factor of 4 to 64, depending on the workloads' skewness. These empirical results are backed by formal analysis that indicates the asymptotic space improvement of our probabilistic admission approach. Additionally, we present d-Way RAP, a hardware-friendly variant of RAP that empirically maintains its space and accuracy benefits.
