Expanding across time to deliver bandwidth efficiency and low latency

Datacenters need networks that support both low-latency and high-bandwidth packet delivery to meet the stringent requirements of modern applications. We present Opera, a dynamic network that delivers latency-sensitive traffic quickly by relying on multi-hop forwarding, in the same way as expander-graph-based approaches, but provides near-optimal bandwidth for bulk flows through direct forwarding over time-varying source-to-destination circuits. The key to Opera's design is the rapid and deterministic reconfiguration of the network, piece by piece, such that at any moment in time the network implements an expander graph, yet, integrated across time, the network provides bandwidth-efficient single-hop paths between all racks. We show that Opera supports low-latency traffic with flow completion times comparable to cost-equivalent static topologies, while delivering up to 4x the bandwidth for all-to-all traffic and supporting 60% higher load for published datacenter workloads.
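The two properties named in the abstract (a well-connected graph at every instant, and a direct circuit between every rack pair over a full cycle) can be illustrated with a toy schedule. The following is a minimal sketch under stated assumptions, not Opera's actual topology-generation procedure, which the abstract does not describe. It assumes each circuit switch implements one perfect matching of racks at a time, that the matchings come from a round-robin (circle-method) factorization, and that exactly one switch reconfigures per time slice. All names (`round_robin_matchings`, `slice_links`, `N_RACKS`, `N_SWITCHES`) are hypothetical, and plain connectivity is checked only as a weak stand-in for expansion.

```python
from itertools import combinations

def round_robin_matchings(n):
    """Circle method: n-1 perfect matchings on n racks (n even), each pair used once."""
    assert n % 2 == 0
    ring = list(range(1, n))
    matchings = []
    for _ in range(n - 1):
        order = [0] + ring  # rack 0 stays fixed, the others rotate
        matchings.append([tuple(sorted((order[i], order[n - 1 - i])))
                          for i in range(n // 2)])
        ring = ring[-1:] + ring[:-1]
    return matchings

def slice_links(t, per_switch):
    """Rack-to-rack links that are up during slice t (assumed schedule)."""
    u = len(per_switch)
    links = set()
    for s, cycle in enumerate(per_switch):
        if s == t % u:
            continue  # assumption: this switch is mid-reconfiguration and carries nothing
        step = (t - s + u - 1) // u  # times switch s has reconfigured before slice t
        links.update(cycle[step % len(cycle)])
    return links

def is_connected(n, links):
    """Depth-first search from rack 0 over the currently active links."""
    adj = {v: set() for v in range(n)}
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == n

N_RACKS, N_SWITCHES = 12, 4  # illustrative sizes, not taken from the paper
matchings = round_robin_matchings(N_RACKS)
per_switch = [matchings[s::N_SWITCHES] for s in range(N_SWITCHES)]
n_slices = N_SWITCHES * max(len(c) for c in per_switch)

all_connected = True
direct = set()  # rack pairs that received a one-hop circuit at some point
for t in range(n_slices):
    links = slice_links(t, per_switch)
    all_connected = all_connected and is_connected(N_RACKS, links)
    direct |= links

all_pairs = set(combinations(range(N_RACKS), 2))
print("connected in every slice:", all_connected)
print("every rack pair got a direct circuit:", direct == all_pairs)
```

In this sketch the staggered reconfiguration is what keeps the instantaneous graph usable for multi-hop forwarding: only one switch is dark in any slice, so the remaining matchings still link all racks, while cycling every switch through its matchings eventually gives each rack pair a one-hop path for bulk traffic.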
