Optimal elephant flow detection

Monitoring the traffic volume of elephant flows, including the total byte count per flow, is a fundamental capability for online network measurement. We present an algorithm for this problem that is asymptotically optimal in both space and time complexity. This improves on previous approaches, which achieve constant update time only when counting packets rather than bytes. We evaluate our work on real packet traces, demonstrating up to a 2.5x speedup compared to the best alternative.
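To make the distinction between packet counting and byte counting concrete, the following is a minimal sketch of a weighted variant of the well-known Space-Saving summary, in which each update adds a packet's size rather than 1. It is not the algorithm presented in this work: the class name, method names, and the linear-scan eviction are illustrative assumptions, and the scan costs time proportional to the number of counters per eviction, which is exactly the kind of non-constant update cost that weighted updates introduce in a naive design.

```python
from typing import Dict, Hashable, List, Tuple


class WeightedSpaceSaving:
    """Toy weighted Space-Saving summary (hypothetical, for illustration only).

    Keeps at most `capacity` counters; each counter upper-bounds the total
    byte count of the flow it currently monitors.
    """

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.counts: Dict[Hashable, int] = {}

    def update(self, flow: Hashable, size: int) -> None:
        """Account for one packet of `size` bytes belonging to `flow`."""
        if flow in self.counts:
            self.counts[flow] += size
        elif len(self.counts) < self.capacity:
            self.counts[flow] = size
        else:
            # Evict the flow with the smallest counter; the new flow inherits
            # that value so its counter never underestimates its true volume.
            # This scan is linear in `capacity`, not constant time.
            victim = min(self.counts, key=self.counts.get)
            floor = self.counts.pop(victim)
            self.counts[flow] = floor + size

    def estimate(self, flow: Hashable) -> int:
        """Return the stored counter for `flow`, or 0 if it is not monitored."""
        return self.counts.get(flow, 0)

    def heavy_hitters(self, threshold: int) -> List[Tuple[Hashable, int]]:
        """Return monitored flows whose counter is at least `threshold` bytes."""
        hits = [(f, c) for f, c in self.counts.items() if c >= threshold]
        return sorted(hits, key=lambda fc: -fc[1])


if __name__ == "__main__":
    summary = WeightedSpaceSaving(capacity=2)
    for flow, size in [("a", 1500), ("b", 40), ("a", 1500), ("c", 60)]:
        summary.update(flow, size)
    print(summary.heavy_hitters(threshold=1000))
```

With unit-size updates, the minimum counter can be maintained in constant time using standard techniques; with arbitrary packet sizes a counter may jump past many others in a single update, which is why byte counting is the harder case.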
