Discovery and Routing of Degraded Fat-Trees

The fat-tree topology has become a popular choice for InfiniBand enterprise systems due to its deadlock freedom, fault tolerance, and full bisection bandwidth. In the HPC domain, InfiniBand fabric is used in almost 42% of the systems on the latest Top 500 list, and many of those systems are based on the fat-tree topology. Despite this popularity, little research has been done to compare the behavior of InfiniBand routing algorithms on degraded fat-tree topologies. In this paper, we identify the weaknesses of the current fat-tree routing algorithm and propose enhancements that relax the restrictions it imposes on the routed fabric. Furthermore, we present a thorough analysis of the non-proprietary routing algorithms implemented in the InfiniBand Open Subnet Manager. Our results show that, although the performance of a fat-tree routed network deteriorates predictably with the number of failed links, the fat-tree routing algorithm is still the best choice for severely degraded fat-tree fabrics.
