LEGUP: using heterogeneity to reduce the cost of data center network upgrades

Fundamental limitations of traditional data center network architectures have led to the development of architectures that provide enormous bisection bandwidth for up to hundreds of thousands of servers. Because these architectures rely on homogeneous switches, implementing one in a legacy data center usually requires replacing most existing switches. Such forklift upgrades are typically prohibitively expensive; instead, a data center manager should be able to selectively add switches to boost bisection bandwidth. Doing so makes the network's switches heterogeneous, and heterogeneous high-performance interconnection topologies are not well understood. We therefore develop the theory of heterogeneous Clos networks. We show that our construction needs only as much link capacity as the classic Clos network to route the same traffic matrices, and that this bound is optimal. In practice, however, placing additional equipment in a highly constrained data center is challenging. We propose LEGUP to design the topology and physical arrangement of such network upgrades or expansions. We show that, compared to current solutions, LEGUP finds network upgrades with more bisection bandwidth for half the cost. When a data center is expanded iteratively, LEGUP's network has 265% more bisection bandwidth than an iteratively upgraded fat-tree.
