Performance evaluation of modified hierarchical ring by exploiting link utilization and memory access locality

In multiprocessor systems, the design of the interconnection network is critical to overall system performance. In this paper, we present a modified hierarchical ring network, called the Torus ring, and evaluate its performance in detail. By exploiting memory access locality, the Torus ring has an advantage over the hierarchical ring when the destination of a network packet is an adjacent local ring, especially in the backward direction. The performance gain of the Torus ring is expected to grow further because of the spatial locality of applications and the more efficient utilization of link bandwidth. In simulation, the Torus ring reduces overall execution time by up to 4.5% compared with the hierarchical ring at moderate ring utilization ratios.
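The backward-direction advantage can be illustrated with a toy hop-count model. This is only a sketch under two assumptions that the abstract does not state: the global ring of the hierarchical network is unidirectional, and the Torus ring adds a direct link from each local ring to both of its neighbouring local rings. The function names and the one-hop link cost are hypothetical, and the model ignores contention, local-ring hops, and everything else the simulation measures.

```python
def global_hops_hierarchical(src_ring: int, dst_ring: int, num_rings: int) -> int:
    """Hops on a unidirectional global ring between two local rings.

    Assumption: packets can only travel in the forward direction.
    """
    return (dst_ring - src_ring) % num_rings


def global_hops_with_adjacent_links(src_ring: int, dst_ring: int, num_rings: int) -> int:
    """Hops when adjacent local rings are also linked directly.

    Assumption (hypothetical model of the Torus ring): a neighbour in
    either direction is one hop away; other destinations still use the
    unidirectional global ring.
    """
    forward = (dst_ring - src_ring) % num_rings
    backward = (src_ring - dst_ring) % num_rings
    if forward == 1 or backward == 1:
        return 1
    return forward


if __name__ == "__main__":
    rings = 8
    # Backward neighbour: ring 3 sending to ring 2.
    print(global_hops_hierarchical(3, 2, rings))
    print(global_hops_with_adjacent_links(3, 2, rings))
```

Under these assumptions, a packet bound for the backward neighbour must traverse almost the whole global ring in the plain hierarchical network, while the direct link makes it a single hop. Memory access locality matters because it makes such neighbour-bound packets more frequent.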
