Enabling flow-based routing control in data center networks using Probe and ECMP

Data center networks often use densely interconnected topologies to provide high bandwidth for internal data exchange. In such networks, it is critical to employ effective load balancing schemes so that the bandwidth resources can be fully utilized. A simple and widely adopted scheme is equal-cost multi-path (ECMP) routing, which is generally supported by commodity switches and routers. However, research shows that ECMP cannot always ensure even traffic distribution among multiple paths. Consequently, ECMP cannot guarantee optimal resource utilization. We propose a scheme that complements ECMP with per-flow rerouting. The basic idea is to perform ECMP-based load balancing by default. When congestion occurs on a certain link, we dynamically reroute one or a few large flows to alternative paths to alleviate the congestion. The main contribution of our research is the design of a scheme that enables per-flow rerouting without introducing any modifications to IP switches and routers. All the flow-based operations and rerouting functionalities are implemented in software that is installed on end hosts and centralized controllers. We call this scheme PROBE (Probe and RerOute based on ECMP). PROBE uses a traceroute-like approach to discover alternative paths and modifies packet headers to enable flow-based rerouting. We show that PROBE is a low-cost, low-complexity, and feasible scheme that can be easily implemented in existing data center networks that consist of commodity switches and routers.
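
To make the idea concrete, the following is a minimal sketch of why modifying a packet header can steer a flow onto a different ECMP path. It is an illustration only, not the PROBE implementation: the hash function, the choice of the source port as the rewritten field, and the names `FlowKey`, `ecmp_next_hop`, and `find_reroute_port` are all assumptions introduced here. Real switches use vendor-specific hash functions, which is why an end host would need to probe rather than compute the path directly.

```python
import hashlib
from typing import Iterable, NamedTuple, Optional


class FlowKey(NamedTuple):
    """Five-tuple that a (hypothetical) ECMP switch hashes on."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int


def ecmp_next_hop(flow: FlowKey, num_paths: int) -> int:
    """Toy ECMP: hash the five-tuple and pick one of num_paths next hops.

    SHA-256 is an assumption for illustration; it stands in for whatever
    hash a real switch applies.
    """
    digest = hashlib.sha256(repr(tuple(flow)).encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_paths


def find_reroute_port(
    flow: FlowKey,
    num_paths: int,
    congested_path: int,
    candidate_ports: Iterable[int],
) -> Optional[int]:
    """Try alternative source ports until one hashes away from the congested path.

    Each candidate plays the role of a probe: the header is altered and the
    resulting path is observed. Returns None if no candidate avoids the path.
    """
    for port in candidate_ports:
        probe = flow._replace(src_port=port)
        if ecmp_next_hop(probe, num_paths) != congested_path:
            return port
    return None


if __name__ == "__main__":
    flow = FlowKey("10.0.0.1", "10.0.1.2", 40000, 80, 6)
    current = ecmp_next_hop(flow, num_paths=4)
    new_port = find_reroute_port(flow, 4, current, range(50000, 50064))
    print("current path:", current, "reroute via source port:", new_port)
```

In this sketch the switch-side hash is untouched; only the end host changes a header field, which mirrors the constraint that switches and routers are not modified.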
