On the Data Path Performance of Leaf-Spine Datacenter Fabrics

Modern data center networks must support a multitude of diverse and demanding workloads at low cost, and even the simplest architectural choices can affect mission-critical application performance. This forces network architects to continually evaluate tradeoffs between ideal designs and pragmatic, cost-effective solutions. In real commercial environments, the number of parameters that the architect can control is fairly limited and typically includes only the choice of topology, link speeds, oversubscription, and switch buffer sizes. In this paper, we provide some guidance to the network architect about the impact these choices have on data path performance. We analyze Leaf-Spine topologies under realistic traffic workloads via high-fidelity simulations and identify what matters for performance and what does not.
