A Comparison of Performance and Accuracy of Measurement Algorithms in Software

Many network functions are moving from hardware to software for better programmability and lower cost. Measurement is critical to most of these functions, because detailed information about traffic is often the first step toward making control decisions and diagnosing problems. The key challenge for measurement is maintaining a large number of counters while processing packets at line rate. Previous work on measurement algorithms mostly focuses on reducing memory usage while achieving high accuracy. Software servers, however, have plenty of memory; the new challenge they pose is achieving both high performance and high accuracy. In this paper, we revisit measurement algorithms and data structures under these new metrics of performance and accuracy. We show that saving memory through extra computation is not worthwhile. As a result, a linear hash table and count array outperform more complex data structures such as Cuckoo hashing, Count-Min sketches, and heaps in a variety of scenarios.
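To make the trade-off concrete, the following is a minimal sketch, not the implementation evaluated in the paper, of two of the structures named above: a linear-probing hash table that keeps exact per-flow counters, and a Count-Min sketch that trades accuracy for memory. The class names, the string flow keys, and the table sizes are assumptions chosen for illustration. The point to notice is the per-packet work: the hash table computes one hash and typically touches adjacent slots, while the sketch computes one hash and one memory access per row.

```python
import hashlib


class LinearProbeCounter:
    """Open-addressing hash table with linear probing; counts are exact."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.keys = [None] * capacity
        self.counts = [0] * capacity

    def add(self, key, value=1):
        i = hash(key) % self.capacity          # one hash per packet
        for _ in range(self.capacity):
            if self.keys[i] is None:           # empty slot: claim it
                self.keys[i] = key
                self.counts[i] = value
                return
            if self.keys[i] == key:            # existing flow: bump counter
                self.counts[i] += value
                return
            i = (i + 1) % self.capacity        # probe the next slot
        raise MemoryError("hash table is full")

    def get(self, key):
        i = hash(key) % self.capacity
        for _ in range(self.capacity):
            if self.keys[i] is None:
                return 0
            if self.keys[i] == key:
                return self.counts[i]
            i = (i + 1) % self.capacity
        return 0


class CountMinSketch:
    """depth x width counter array; estimates never undercount."""

    def __init__(self, depth, width):
        self.depth = depth
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # A different salt per row gives `depth` distinct hash functions.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, value=1):
        for row in range(self.depth):          # `depth` hashes per packet
            self.rows[row][self._index(row, key)] += value

    def estimate(self, key):
        return min(
            self.rows[row][self._index(row, key)] for row in range(self.depth)
        )


# Hypothetical packet stream, reduced to flow identifiers.
packets = ["flowA", "flowB", "flowA", "flowC", "flowA", "flowB"]

table = LinearProbeCounter(capacity=8)
sketch = CountMinSketch(depth=3, width=4)
for flow in packets:
    table.add(flow)
    sketch.add(flow)
```

With memory to spare, the hash table can be sized so that probes stay short, which is the regime the abstract argues for; the sketch saves space at the cost of extra hashing and possible overestimation.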
