A new routing scheme for jellyfish and its performance with HPC workloads

The jellyfish topology, in which switches are connected using a random graph, has recently been proposed for large-scale data-center networks. It has been shown to offer higher bisection bandwidth and better permutation throughput than a fat-tree topology of similar cost. In this work, we propose a new routing scheme for jellyfish that outperforms existing schemes by exploiting path diversity more effectively, and we comprehensively compare the performance of jellyfish and fat-tree topologies with HPC workloads. The results indicate that jellyfish and fat-tree topologies offer comparably high performance for HPC workloads on systems that can be realized by 3-level fat-trees using current technology and by the corresponding jellyfish topologies of similar cost. Fat-trees are more effective for smaller systems, while jellyfish is more scalable.
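
To make the idea of a switch-level random graph concrete, the following is a minimal sketch, not the construction or routing scheme used in this work. It assumes every switch devotes the same number of ports `r` to switch-to-switch links and draws a random r-regular graph with the pairing (configuration) model plus rejection, a standard textbook technique swapped in here for illustration. The function names `random_regular_graph` and `hop_counts` are hypothetical.

```python
import random
from collections import deque


def random_regular_graph(n, r, seed=None, max_tries=10000):
    """Return adjacency sets of a random simple r-regular graph on n switches.

    Assumption: pairing model with rejection (illustrative only, not
    necessarily how a real jellyfish network is wired).
    """
    if (n * r) % 2 != 0 or r >= n:
        raise ValueError("need n * r even and r < n")
    rng = random.Random(seed)
    for _ in range(max_tries):
        # One "stub" per inter-switch port; shuffle and pair them up.
        stubs = [v for v in range(n) for _ in range(r)]
        rng.shuffle(stubs)
        adj = [set() for _ in range(n)]
        simple = True
        for a, b in zip(stubs[0::2], stubs[1::2]):
            if a == b or b in adj[a]:  # self-loop or duplicate link
                simple = False
                break
            adj[a].add(b)
            adj[b].add(a)
        if simple:
            return adj
    raise RuntimeError("no simple graph found within max_tries")


def hop_counts(adj, src):
    """Breadth-first search: hop distance from src to each reachable switch."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


if __name__ == "__main__":
    adj = random_regular_graph(n=20, r=4, seed=1)
    dist = hop_counts(adj, 0)
    print("switches reachable from switch 0:", len(dist))
    print("largest hop count from switch 0:", max(dist.values()))
```

Hop counts from a single switch give a rough feel for how short paths in such a graph tend to be; a routing scheme that exploits path diversity would additionally enumerate several distinct paths per switch pair, which this sketch does not attempt.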
