Bridging the gap between applications and networks in data centers

Modern data centers host tens (if not hundreds) of thousands of servers and are used by companies such as Amazon, Google, and Microsoft to provide online services to millions of users across the Internet. They are built from commodity hardware, and their network infrastructure adopts principles that evolved from enterprise and Internet networking. Applications use UDP datagrams or TCP sockets as the primary interface to other applications running inside the data center. This effectively isolates the network from the end systems, which have little control over how the network handles their packets. Likewise, the network has limited visibility into the application logic: an application injects a packet with a destination address, and the network simply delivers it. Networks and applications effectively treat each other as black boxes.

This strict separation between applications and networks (also referred to as a "dumb network") is a direct outcome of the so-called end-to-end argument [49]. It has arguably been one of the main reasons why the Internet has been able to evolve from a small research project to planetary scale, supporting a multitude of hardware and network technologies and a slew of very diverse applications, over networks owned by competing ISPs.

Despite being so instrumental in the success of the Internet, this black-box design is also one of the root causes of inefficiencies in large-scale data centers. Given their limited control over and visibility into network resources, applications need to use low-level hacks, for example to extract network properties (using traceroute and IP addresses to infer the network topology) or to prioritize traffic (increasing the number of TCP flows an application uses to increase its bandwidth share). Further, simple functionality such as multicast or anycast routing is not available, and developers must resort to application-level overlays. This, however, leads to inefficiencies, as multiple logical links are typically mapped to the same physical link, significantly reducing application throughput. Even with perfect knowledge of the underlying topology, there is still the constraint that servers
