Unlocking Credit Loop Deadlocks

Emerging Converged Enhanced Ethernet (CEE) data center networks rely on layer-2 flow control to support transport protocols that are sensitive to packet loss, such as RDMA and FCoE. Although lossless networks have been shown to improve end-to-end network performance, without careful design and operation they can suffer from in-network deadlocks caused by cyclic buffer dependencies. These dependencies are called credit loops. Although existing credit loops rarely deadlock, when they do they can block large parts of the network. Naive solutions recover from a credit loop deadlock by draining buffers and dropping packets. Previous works suggested avoiding credit loops through central routing algorithms, but these assume specific topologies and are slow to react to failures. In this paper we present a distributed algorithm that detects credit loop deadlocks, assures traffic progress, and recovers from the deadlock, for arbitrary network topologies and routing protocols. The algorithm can be implemented on commodity switch hardware, requires negligible additional control bandwidth, and avoids packet loss after the deadlock occurs. We introduce two flavors of the algorithm and discuss their trade-offs. We define a simple scenario that guarantees that a credit loop deadlock occurs and use it to test and analyze the algorithm. In addition, we provide simulation results for a 3-level fat-tree network. Last, we describe our prototype implementation on a commodity data center switch.
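
To make the notion of a cyclic buffer dependency concrete, the following minimal sketch checks a buffer-dependency graph for a credit loop. It is a centralized, offline illustration only, not the distributed algorithm presented in the paper. The function name `find_credit_loop`, the graph encoding (each buffer maps to the buffers it waits on for credits), and the buffer labels are all assumptions made for this example.

```python
from typing import Dict, List, Optional

# Hypothetical encoding: deps[b] lists the downstream buffers that buffer b
# must obtain credits from before it can forward its packets.
def find_credit_loop(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle of buffers (a credit loop) if any exists, else None."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current DFS path, finished
    color: Dict[str, int] = {}
    path: List[str] = []           # buffers on the current DFS path

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        path.append(node)
        for nxt in deps.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # Back edge: nxt is already on the path, so the path
                # from nxt to node closes a cyclic dependency.
                return path[path.index(nxt):]
            if state == WHITE:
                loop = visit(nxt)
                if loop is not None:
                    return loop
        path.pop()
        color[node] = BLACK
        return None

    for node in deps:
        if color.get(node, WHITE) == WHITE:
            loop = visit(node)
            if loop is not None:
                return loop
    return None


# Three buffers waiting on each other in a ring form a credit loop.
ring = {"A": ["B"], "B": ["C"], "C": ["A"]}
# A simple up/down path has no cyclic dependency.
tree = {"leaf1": ["spine"], "spine": ["leaf2"], "leaf2": []}
```

Here `find_credit_loop(ring)` reports the loop through `A`, `B`, and `C`, while `find_credit_loop(tree)` returns `None`. A loop found this way only indicates that a deadlock is possible; whether it actually deadlocks depends on the traffic filling the buffers.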
