MatrixDCN: a high performance network architecture for large-scale cloud data centers

With the widespread deployment of cloud services, data center networks are evolving toward large-scale, multi-path designs. Conventional switching-oriented data center networks face difficulties in scalability and flexibility when supporting the increasing bandwidth requirements of cloud services. To solve this problem, this paper proposes MatrixDCN, a simple and scalable architecture. MatrixDCN is an approximately non-blocking network in which switches and servers are arranged in rows and columns that form a matrix structure. A MatrixDCN network can accommodate up to hundreds of thousands of servers without bandwidth bottlenecks. Furthermore, the physical topology of a MatrixDCN network can be designed to be consistent with its logical topology, which reduces the complexity of managing and maintaining a data center. An efficient routing algorithm, fault-avoidance routing (FAR), is designed for MatrixDCN to fully exploit the regularity of the topology. FAR builds two routing tables for each router: a basic routing table (BRT), built from the local topology, and a novel negative routing table (NRT), built incrementally from learned partial network failures. This approach avoids the problem of network convergence and shortens the time needed to calculate routing tables. FAR also greatly reduces the size of routing tables by introducing NRTs at routers. Theoretical analysis and simulations show that MatrixDCN has advantages in topology scalability, network throughput, and the performance of FAR. Copyright © 2015 John Wiley & Sons, Ltd.
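The two-table idea can be illustrated with a minimal sketch. This is not the paper's implementation: the class `FarRouter`, its method names, the exact-match destination keys, and the switch labels are all hypothetical, chosen only to show how a table derived from the regular topology might be filtered by a table of learned failures at lookup time.

```python
from dataclasses import dataclass, field


@dataclass
class FarRouter:
    """Hypothetical router holding a BRT and an NRT (illustrative only)."""

    # BRT: destination -> candidate next hops derived from the local topology.
    brt: dict[str, list[str]] = field(default_factory=dict)
    # NRT: destination -> next hops to avoid because of learned failures.
    nrt: dict[str, set[str]] = field(default_factory=dict)

    def learn_failure(self, dest: str, next_hop: str) -> None:
        """Record that next_hop should be avoided for dest; the BRT is untouched."""
        self.nrt.setdefault(dest, set()).add(next_hop)

    def clear_failure(self, dest: str, next_hop: str) -> None:
        """Remove a negative entry once the failure is repaired."""
        hops = self.nrt.get(dest)
        if hops is None:
            return
        hops.discard(next_hop)
        if not hops:
            del self.nrt[dest]

    def next_hops(self, dest: str) -> list[str]:
        """Return BRT candidates for dest minus those listed in the NRT."""
        avoid = self.nrt.get(dest, set())
        return [hop for hop in self.brt.get(dest, []) if hop not in avoid]


if __name__ == "__main__":
    router = FarRouter(brt={"10.1.2.0/24": ["sw-a", "sw-b"]})
    router.learn_failure("10.1.2.0/24", "sw-a")
    print(router.next_hops("10.1.2.0/24"))
```

In this sketch a failure only adds or removes a small negative entry, so the topology-derived table never has to be recomputed. That is one plausible reading of why the abstract says the NRT avoids network convergence; the paper itself should be consulted for the actual table formats and matching rules.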
