Modeling Universal Globally Adaptive Load-Balanced Routing

Universal globally adaptive load-balanced (UGAL) routing has been proposed for various interconnection networks and has been deployed in a number of current-generation supercomputers. Although UGAL-based schemes have been extensively studied, most existing results are based on either simulation or measurement. Without a theoretical understanding of UGAL, several questions remain open: For which traffic patterns is UGAL best suited? What determines the performance of a UGAL-based scheme on a particular network configuration? In this work, we develop a set of throughput models for UGAL based on linear programming. We show that the throughput models are valid across the torus, Dragonfly, and Slim Fly network topologies. Finally, we identify a robust model that can accurately and efficiently predict UGAL throughput for a set of representative traffic patterns across different topologies. Our models not only provide a mechanism to predict UGAL performance on large-scale interconnection networks but also reveal the inner workings of UGAL and further our understanding of this type of routing.
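For readers unfamiliar with UGAL, the following minimal sketch illustrates the per-packet decision as it is commonly described in the routing literature: the estimated delay of a minimal path (queue occupancy times hop count) is compared with that of a randomly chosen non-minimal path, and the cheaper one is taken. It is not one of the throughput models developed in this work. The names PathCandidate, ugal_choose, and the bias parameter are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PathCandidate:
    """A candidate path (hypothetical type): hop count and first-hop queue occupancy."""
    hops: int
    queue_len: int


def ugal_choose(minimal: PathCandidate, nonminimal: PathCandidate, bias: int = 0) -> str:
    """Pick a path by comparing queue length times hop count.

    The optional bias toward the minimal path is an assumption of this
    sketch; real deployments tune such thresholds differently.
    """
    min_cost = minimal.queue_len * minimal.hops
    non_cost = nonminimal.queue_len * nonminimal.hops
    return "minimal" if min_cost <= non_cost + bias else "nonminimal"


# Lightly loaded minimal path: 2 * 3 = 6 versus 4 * 6 = 24, so stay minimal.
light = ugal_choose(PathCandidate(hops=3, queue_len=2), PathCandidate(hops=6, queue_len=4))

# Congested minimal path: 10 * 3 = 30 versus 2 * 6 = 12, so divert.
congested = ugal_choose(PathCandidate(hops=3, queue_len=10), PathCandidate(hops=6, queue_len=2))
```

Under this rule the routing behaves like minimal routing when the network is lightly loaded and shifts traffic to non-minimal paths as minimal-path queues grow, which is why its throughput depends on both the traffic pattern and the topology.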
