FiConn: Using Backup Port for Server Interconnection in Data Centers

The goal of data center networking is to interconnect a large number of server machines with low equipment cost, high and balanced network capacity, and robustness to link and server faults. It is well understood that the current practice, in which servers are connected by a tree hierarchy of network switches, cannot meet these requirements (8), (9). In this paper, we explore a new server-interconnection structure. We observe that the commodity server machines used in today's data centers usually come with two built-in Ethernet ports, one for network connection and the other left for backup purposes. We believe that, if both ports are actively used in network connections, we can build a low-cost interconnection structure without the expensive higher-level large switches. Our new network design, called FiConn, uses both ports and only low-end commodity switches to form a scalable and highly effective structure. Although the server node degree is only two in this structure, we have proven that FiConn is highly scalable, encompassing hundreds of thousands of servers with low diameter and high bisection width. The routing mechanism in FiConn balances different levels of links. We have further developed a low-overhead traffic-aware routing mechanism to improve effective link utilization based on dynamic traffic state. Simulation results demonstrate that the routing mechanisms indeed achieve high networking throughput.
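The scalability claim can be illustrated with a short arithmetic sketch. The abstract does not spell out the construction, so the recurrence below is an assumption: a level-0 unit is n servers on one n-port switch, and each higher level spends half of the backup ports still free in a lower-level unit to link it to its sibling units. The function name `ficonn_size` and its parameters are hypothetical, chosen only for this illustration.

```python
def ficonn_size(n: int, k: int) -> int:
    """Estimate the number of servers in a level-k structure.

    Assumed recurrence (not stated in the abstract):
      - level 0: n servers attached to one n-port switch, all backup ports free
      - level j: half of the free backup ports in a level-(j-1) unit are used,
        one per sibling unit, so a level-j structure holds (free / 2 + 1) units
    """
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even number of switch ports")
    if k < 0:
        raise ValueError("k must be non-negative")

    servers = n
    for level in range(1, k + 1):
        # Each earlier level consumed half of the then-free backup ports.
        free_backup_ports = servers // 2 ** (level - 1)
        # Half of the free ports link to siblings; add one for the unit itself.
        units = free_backup_ports // 2 + 1
        servers *= units
    return servers


if __name__ == "__main__":
    for ports in (4, 16, 48):
        print(ports, [ficonn_size(ports, level) for level in range(3)])
```

Under these assumptions, 48-port switches give 48, 1,200, and 361,200 servers at levels 0, 1, and 2, which is consistent with the "hundreds of thousands of servers" figure even though every server uses only two ports.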

[1] Dharma P. Agrawal, et al. Generalized Hypercube and Hyperbus Structures for a Computer Network, 1984, IEEE Transactions on Computers.

[2] Howard Gobioff, et al. The Google file system, 2003, SOSP '03.

[3] Sajal K. Das, et al. Book Review: Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes by F. T. Leighton (Morgan Kaufmann Pub, 1992), 1992, SIGA.

[4] William J. Dally, et al. Flattened butterfly: a cost-efficient topology for high-radix networks, 2007, ISCA '07.

[5] Lei Shi, et al. Dcell: a scalable and fault-tolerant network structure for data centers, 2008, SIGCOMM '08.

[6] F. Leighton, et al. Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes, 1991.

[7] Frank Thomson Leighton. Introduction to parallel algorithms and architectures: arrays, 1992.

[8] Yuan Yu, et al. Dryad: distributed data-parallel programs from sequential building blocks, 2007, EuroSys '07.

[9] Behrooz Parhami, et al. Introduction to Parallel Processing: Algorithms and Architectures, 1999.

[10] Amin Vahdat, et al. A scalable, commodity data center network architecture, 2008, SIGCOMM '08.

[11] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.

[12] W. Dally, et al. Route packets, not wires: on-chip interconnection networks, 2001, Proceedings of the 38th Design Automation Conference (IEEE Cat. No.01CH37232).

[13] Theodore R. Bashkow, et al. A large scale, homogeneous, fully distributed parallel machine, I, 1977, ISCA '77.

[14] Dmitri Loguinov, et al. Graph-theoretic analysis of structured peer-to-peer systems: routing distances and fault resilience, 2003, IEEE/ACM Transactions on Networking.

[15] William J. Dally, et al. Technology-Driven, Highly-Scalable Dragonfly Topology, 2008, 2008 International Symposium on Computer Architecture.

[16] Laxmi N. Bhuyan, et al. A general class of processor interconnection strategies, 1982, ISCA '82.