Improving datacenter performance and robustness with multipath TCP

The latest large-scale data centers offer higher aggregate bandwidth and robustness by creating multiple paths in the core of the network. Utilizing this bandwidth requires that different flows take different paths, which poses a challenge. In short, a single-path transport seems ill-suited to such networks. We propose using Multipath TCP (MPTCP) as a replacement for TCP in such data centers, as it can effectively and seamlessly use available bandwidth, giving improved throughput and better fairness on many topologies. We investigate what causes these benefits, teasing apart the contribution of each of the mechanisms used by MPTCP. Using MPTCP lets us rethink data center networks, with a different mindset as to the relationship between transport protocols, routing, and topology. MPTCP enables topologies that single-path TCP cannot utilize. As a proof of concept, we present a dual-homed variant of the FatTree topology. With MPTCP, this outperforms FatTree for a wide range of workloads at the same cost. In existing data centers, MPTCP is readily deployable, leveraging widely deployed technologies such as ECMP. We have run MPTCP on Amazon EC2 and found that it outperforms TCP by a factor of three when there is path diversity. But the biggest benefits will come when data centers are designed for multipath transports.
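To illustrate why randomized single-path placement can leave core capacity unused, the following is a minimal toy sketch, not the evaluation method of the paper. It assumes that every subflow may land on any core path and that ECMP hashing behaves like a uniform random choice; the function name `idle_path_fraction` and all of its parameters are hypothetical.

```python
import random


def idle_path_fraction(num_paths, num_flows, subflows_per_flow, seed=0):
    """Return the fraction of core paths that carry no traffic.

    Assumption: each subflow is placed on a uniformly random path,
    which stands in for ECMP hashing of the subflow's five-tuple.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    used = set()
    for _ in range(num_flows * subflows_per_flow):
        used.add(rng.randrange(num_paths))
    return 1 - len(used) / num_paths


if __name__ == "__main__":
    # One subflow per flow models single-path TCP; eight models a
    # multipath transport that opens eight subflows per connection.
    for k in (1, 8):
        print(k, idle_path_fraction(1000, 1000, k))
```

Under these assumptions, with as many flows as paths, about a fraction 1/e of the paths are expected to stay idle when each flow uses a single path, and the expected idle fraction falls roughly as e^(-k) with k subflows per flow. Idle paths are only a proxy for lost throughput; the model ignores topology constraints and congestion control.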
