Larry: Practical Network Reconfigurability in the Data Center

Modern data center (DC) applications require high cross-rack network bandwidth and ultra-low, predictable end-to-end latency. These requirements are hard to meet in traditional DC networks, where the bandwidth between a Top-of-Rack (ToR) switch and the rest of the DC is typically oversubscribed. Larry is a network design that allows racks to dynamically adapt their bandwidth to the aggregation switches as a function of traffic demand. Larry reconfigures the network topology so that racks with high demand can use underutilized uplinks from their neighbors. Because it operates at the physical layer, Larry has a predictably low traffic forwarding overhead, which makes it well suited to latency-sensitive applications. Larry is effective even when deployed on a small set of racks (e.g., 4) because rack traffic demand is not correlated in many DC workloads. It can be deployed incrementally and coexists transparently with existing non-reconfigurable racks. Our prototype uses a 40 Gbps electrical circuit switch that we have built, with a simple local control plane. Using multiple workloads, we show that Larry improves tail latency by up to 2.3x for the same network cost.
