An Efficient Label Routing on High-Radix Interconnection Networks

Cost-effective adaptive routing has a significant impact on the overall performance of high-radix hierarchical topologies such as Dragonfly, which achieve a lower network diameter than traditional topologies such as torus and fat tree but exhibit a lower degree of adaptiveness for shortest-path routing. Existing adaptive routing methods for these hierarchical topologies improve adaptiveness by increasing path length, i.e., through local or global adaptive routing, and thus suffer from complex and costly deadlock avoidance. This work aims to maximize routing adaptiveness at the minimum cost of deadlock avoidance. We propose a label routing method for high-radix hierarchical networks. The method follows a co-design methodology and coordinates two pipelines in the router microarchitecture, the input queue and the routing computation. Packets in the input buffer are labeled by our routing algorithm according to network state. We reorganize the input buffer and develop a label routing algorithm named Green-Red Routing (GRR). GRR relaxes the requirement of using virtual channels to eliminate routing deadlock and reduces the buffer resources dedicated to deadlock avoidance. GRR carefully manages buffer resources and balances their utilization, achieving fully adaptive routing efficiently. We conduct extensive experiments to evaluate the performance of GRR on Dragonfly and compare it with state-of-the-art routing algorithms. The results show that GRR achieves 10%–35% higher performance than existing routing algorithms under most traffic patterns.
