A cost-efficient congestion management methodology for fat-trees using traffic pattern detection

Interconnection networks have a great impact on the performance of parallel systems: they provide the communication mechanism and framework that parallel applications need. One important such network is the fat-tree. Selection functions have been shown to strongly affect the performance of fat-trees, and they perform differently under different traffic patterns. The stage and destination priority (SADP) selection function has been shown to perform better under uniform traffic, while the stage and origin priority (SAOP) selection function has been shown to perform better under hot-spot traffic. In this paper, we propose a cost-efficient congestion management mechanism for fat-trees that chooses the selection function according to the traffic pattern. The mechanism detects the current traffic pattern and switches to the selection function that has been shown to perform better under that pattern, which directly reduces congestion in the network. First, we analyze hot-spot traffic in fat-trees when the SADP selection function is used, derive a condition for the existence of hot-spot traffic, and give an implementation for detecting this condition. Once the condition is detected, the network is forced to switch to the SAOP selection function. Then, we use the analysis of SAOP to derive a condition for detecting that non-hot-spot traffic exists in the fat-tree, and again give an implementation for detecting it. When this second condition is detected, the network switches back to the SADP selection function. We use synthetic workloads to show the accuracy of the proposed mechanism in detecting hot-spot traffic in the network. We show that the overhead of the proposed mechanism is a constant number of bits per physical link. Finally, we compare the proposed mechanism with other techniques.
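The switching behavior described above can be pictured as a two-state controller. The following is only a minimal sketch under stated assumptions, not the paper's implementation: the class and parameter names (`SelectionSwitcher`, `hot_spot_detected`, `non_hot_spot_detected`) are hypothetical, and the two detection conditions, which the paper derives but this abstract does not state, are passed in as already-evaluated booleans. Starting in SADP is also an assumption.

```python
from enum import Enum


class SelectionFunction(Enum):
    SADP = "stage and destination priority"
    SAOP = "stage and origin priority"


class SelectionSwitcher:
    """Hypothetical two-state controller for the selection function in use."""

    def __init__(self) -> None:
        # Assumption: the network starts with SADP (favoured for uniform traffic).
        self.current = SelectionFunction.SADP

    def update(self, hot_spot_detected: bool,
               non_hot_spot_detected: bool) -> SelectionFunction:
        # Each flag stands in for one of the derived detection conditions;
        # how they are computed is outside the scope of this sketch.
        if self.current is SelectionFunction.SADP and hot_spot_detected:
            # Hot-spot condition observed while using SADP: switch to SAOP.
            self.current = SelectionFunction.SAOP
        elif self.current is SelectionFunction.SAOP and non_hot_spot_detected:
            # Non-hot-spot condition observed while using SAOP: switch back.
            self.current = SelectionFunction.SADP
        return self.current
```

Only the condition that matters in the current state is consulted, which mirrors the abstract: the hot-spot condition is defined under SADP, and the non-hot-spot condition under SAOP.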
