Identifying elephant flows in Internet backbone traffic with Bloom filters and LRU

Traffic measurements provide critical input for network security, traffic engineering, and accounting. Limited computing and storage capabilities have motivated recent research on partial flow maintenance, such as identifying elephant flows. Because traditional algorithms suffer from a high false negative probability, a novel scheme called BF-LRU (Bloom filters and least recently used) is presented. The BF-LRU scheme uses LRU replacement to evict mice flows and a Bloom filter representation to retain heavy hitters. Upper bounds on the error probability are derived in detail from the Pareto and hypergeometric distributions. Experiments are conducted on real Internet traffic data. Simulation results indicate that BF-LRU not only achieves a lower error probability but also scales up to an OC-768 backbone trace without losing any space efficiency.
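
The combination described above can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the promotion rule, so the sketch assumes that a flow is promoted from the LRU table into the Bloom filter once its packet count reaches a threshold. The names `BloomFilter`, `ElephantDetector`, `lru_capacity`, and `threshold`, as well as the SHA-256 based hashing, are hypothetical choices made for illustration.

```python
import hashlib
from collections import OrderedDict


class BloomFilter:
    """Fixed-size bit array with k hash positions per key (assumed design)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # May return a false positive, never a false negative.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


class ElephantDetector:
    """LRU table of candidate flows plus a Bloom filter of promoted flows."""

    def __init__(self, lru_capacity=4, threshold=3, num_bits=1024, num_hashes=3):
        self.lru = OrderedDict()      # flow_id -> packet count, oldest first
        self.capacity = lru_capacity
        self.threshold = threshold    # assumed promotion rule (packet count)
        self.elephants = BloomFilter(num_bits, num_hashes)

    def observe(self, flow_id):
        """Process one packet; return True if the flow is flagged as an elephant."""
        if flow_id in self.elephants:
            return True
        count = self.lru.pop(flow_id, 0) + 1
        if count >= self.threshold:
            # Promote: the Bloom filter keeps the flow; the LRU slot is freed.
            self.elephants.add(flow_id)
            return True
        self.lru[flow_id] = count     # re-insert at the most recently used end
        if len(self.lru) > self.capacity:
            self.lru.popitem(last=False)  # evict the least recently used flow
        return False


detector = ElephantDetector(lru_capacity=2, threshold=3)
flags = [detector.observe(f) for f in ["A", "B", "A", "C", "A"]]
```

In this run, flow `B` is evicted from the LRU table when `C` arrives, while flow `A` reaches the assumed threshold on its third packet and is recorded in the Bloom filter, so later packets of `A` are recognized without occupying an LRU slot.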
