A cost-effective low-latency overlaid torus-based data center network architecture

Abstract In this paper, we present the design, analysis, and implementation of a novel data center network architecture named CLOT, which significantly reduces network diameter, network latency, and infrastructure cost. CLOT is built on a switchless torus topology, to which only a limited number of the most beneficial low-end switches are added at carefully chosen positions. Interconnecting servers that are in close proximity to each other in a torus topology exploits network locality. The extra layer of switches substantially shortens the average routing path length of the torus network, which increases communication efficiency. We show that CLOT achieves lower latency, shorter routing paths, higher bisection bandwidth and throughput, and better fault tolerance than both conventional hierarchical data center networks and the recently proposed CamCube network. Coupled with coordinate-based geographical addresses and credit-based flow control, the specially designed POW routing algorithm helps CLOT achieve its maximum theoretical performance. In addition, an automatic address configuration mechanism and a malfunction detection mechanism are provided to facilitate network deployment and configuration. Mathematical analysis and theoretical derivation establish both the guaranteed and the ideal performance of CLOT.
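To make the path-length baseline concrete, the following minimal Python sketch computes the diameter and average shortest-path length of a plain switchless torus, the topology CLOT starts from. It is not the paper's analysis and does not model the added switch layer or POW routing; the function names and the 4 x 4 x 4 example size are assumptions chosen for illustration.

```python
from itertools import product


def ring_distance(a, b, k):
    """Shortest hop count between positions a and b on a ring of size k."""
    d = abs(a - b) % k
    return min(d, k - d)


def torus_distance(src, dst, dims):
    """Minimal hop count between two nodes of a torus (sum of per-dimension ring distances)."""
    return sum(ring_distance(s, d, k) for s, d, k in zip(src, dst, dims))


def torus_diameter(dims):
    """Largest minimal hop count between any two nodes: floor(k/2) per dimension."""
    return sum(k // 2 for k in dims)


def torus_average_distance(dims):
    """Average minimal hop count from one node to every other node.

    A torus looks the same from every node, so measuring from the
    origin gives the network-wide average.
    """
    origin = (0,) * len(dims)
    nodes = list(product(*(range(k) for k in dims)))
    total = sum(torus_distance(origin, n, dims) for n in nodes)
    return total / (len(nodes) - 1)


if __name__ == "__main__":
    dims = (4, 4, 4)  # hypothetical 64-server 3D torus
    print("diameter:", torus_diameter(dims))
    print("average path length:", round(torus_average_distance(dims), 3))
```

Both quantities grow with the size of each dimension, which is the cost that the overlaid switch layer described in the abstract is meant to reduce.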
