Exploiting global knowledge to achieve self-tuned congestion control for k-ary n-cube networks

Network performance in tightly-coupled multiprocessors typically degrades rapidly beyond network saturation. Consequently, designers must keep a network below its saturation point by reducing the load on the network. Congestion control via source throttling, a common technique for reducing network load, prevents new packets from entering the network in the presence of congestion. Unfortunately, prior source-throttling schemes either lack the global information about the network needed to decide correctly whether to throttle, or depend on specific network parameters or communication patterns. This paper presents a global-knowledge-based, self-tuned congestion control technique that prevents saturation at high loads across different communication patterns for k-ary n-cube networks. Our design is composed of two key components. First, we use global information about a network to obtain a timely estimate of network congestion, and we compare this estimate to a threshold value to determine when to throttle packet injection. Second, a self-tuning mechanism automatically determines appropriate threshold values based on throughput feedback. The combination of these two techniques provides high performance under heavy load, does not penalize performance under light load, and gracefully adapts to changes in communication patterns.
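
The two components described in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical: the abstract does not say how the congestion estimate is computed or how the threshold is tuned, so the sketch assumes the estimate is a single integer (for example, a network-wide count of occupied buffers) and substitutes a simple hill-climbing rule for the self-tuning step. It is not the paper's implementation.

```python
from dataclasses import dataclass


@dataclass
class SelfTunedThrottle:
    """Hypothetical source-throttling controller (illustrative only)."""

    threshold: int                 # throttle when the estimate exceeds this value
    step: int = 1                  # assumed hill-climbing step size
    min_threshold: int = 1         # assumed lower bound on the threshold
    max_threshold: int = 1024      # assumed upper bound on the threshold
    _direction: int = 1            # +1 raises the threshold, -1 lowers it
    _last_throughput: float = 0.0  # throughput seen in the previous tuning period

    def should_throttle(self, congestion_estimate: int) -> bool:
        """Component 1: compare the global congestion estimate to the threshold."""
        return congestion_estimate > self.threshold

    def tune(self, throughput: float) -> None:
        """Component 2 (assumed rule): adjust the threshold from throughput feedback.

        Keep moving the threshold in the same direction while throughput does
        not drop; reverse direction when throughput falls.
        """
        if throughput < self._last_throughput:
            self._direction = -self._direction
        candidate = self.threshold + self._direction * self.step
        self.threshold = max(self.min_threshold, min(self.max_threshold, candidate))
        self._last_throughput = throughput


if __name__ == "__main__":
    controller = SelfTunedThrottle(threshold=10, step=2)
    print(controller.should_throttle(11))  # estimate above threshold
    controller.tune(5.0)                   # first feedback sample
    print(controller.threshold)
```

In this sketch, each node would call `should_throttle` before injecting a packet and `tune` once per tuning period. How the global estimate is gathered and distributed is left out, since the abstract gives no details.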
