A lossless switch for data acquisition networks

The recent trends in software-defined networking (SDN) and network function virtualization (NFV) are boosting the advance of software-based packet processing and forwarding on commodity servers. Although performance has traditionally been the main challenge of this approach, the situation is changing with modern server platforms. High-performance load balancers, proxies, virtual switches and other network functions can now be implemented in software rather than specialized commercial hardware, reducing cost and increasing flexibility. In this paper we design a lossless software-based switch for high-bandwidth data acquisition (DAQ) networks, using the ATLAS experiment at CERN as a case study. We show that it can effectively solve the incast pathology arising from the many-to-one communication pattern present in DAQ networks by providing extremely high buffering capabilities. We evaluate this design on a commodity server equipped with twelve 10 Gbps Ethernet interfaces providing a total bandwidth of 120 Gbps.
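The following is a minimal conceptual sketch, not the authors' implementation: it illustrates the central idea that a software switch can keep a deep per-output-port FIFO in host DRAM, so a many-to-one (incast) burst that would overflow the shallow on-chip buffers of a fixed-function switch ASIC is simply queued instead of dropped. All names, queue sizes and the simulated burst are illustrative assumptions.

```c
/*
 * Sketch: one output-port queue of a software switch, backed by host DRAM.
 * Buffering capacity scales with server memory rather than ASIC SRAM, so an
 * incast burst is absorbed rather than dropped. Sizes are illustrative only.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PKT_SIZE   1500           /* assume MTU-sized packets                */
#define QUEUE_PKTS (1u << 20)     /* ~1.5 GB of buffering for one output port */

struct pkt { unsigned char data[PKT_SIZE]; };

/* Single-producer/single-consumer FIFO acting as one output-port queue. */
struct port_queue {
    struct pkt *slots;
    size_t head, tail;            /* head: next dequeue, tail: next enqueue  */
};

static int pq_init(struct port_queue *q) {
    q->slots = malloc(sizeof(struct pkt) * QUEUE_PKTS);
    q->head = q->tail = 0;
    return q->slots ? 0 : -1;
}

static int pq_enqueue(struct port_queue *q, const struct pkt *p) {
    if (q->tail - q->head == QUEUE_PKTS)
        return -1;                /* queue full: only then would a drop occur */
    q->slots[q->tail % QUEUE_PKTS] = *p;
    q->tail++;
    return 0;
}

static int pq_dequeue(struct port_queue *q, struct pkt *p) {
    if (q->head == q->tail)
        return -1;                /* queue empty */
    *p = q->slots[q->head % QUEUE_PKTS];
    q->head++;
    return 0;
}

int main(void) {
    struct port_queue out;
    if (pq_init(&out) != 0) { perror("malloc"); return 1; }

    /* Simulate an incast burst: eleven senders each offer 64k packets to the
     * same output port faster than the 10 Gbps link can drain them.          */
    struct pkt p; memset(&p, 0, sizeof p);
    size_t accepted = 0, offered = 11 * 65536;
    for (size_t i = 0; i < offered; i++)
        if (pq_enqueue(&out, &p) == 0)
            accepted++;

    printf("offered %zu packets, buffered %zu, dropped %zu\n",
           offered, accepted, offered - accepted);

    /* The queue is drained later at link rate; nothing was lost because host
     * DRAM is orders of magnitude larger than on-chip switch buffers.        */
    while (pq_dequeue(&out, &p) == 0) { }
    free(out.slots);
    return 0;
}
```

In a real software switch the same queues would sit behind a kernel-bypass packet I/O framework (e.g. DPDK or netmap, both cited by the paper) with batched receive and transmit loops; the sketch only shows why DRAM-scale buffering makes the switch effectively lossless under incast.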
