Hiding communication latency in reconfigurable message-passing environments

Communication overhead is one of the most important factors affecting the performance of message-passing multicomputers. Through the analysis of several parallel benchmarks, we present evidence that communication locality exists and that this locality is "structured". We have devised a number of heuristics that can "predict" the target of subsequent communication requests. This technique can be applied to reconfigurable interconnects to hide communication latency by reconfiguring the interconnect concurrently with the computation. By comparing the inter-communication computation times of a number of parallel benchmarks with some specific reconfiguration times, we argue that the computation interval can be used to hide the concurrent reconfiguration of the interconnect, and we present the performance enhancements obtained with the proposed heuristics.
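
To make the idea concrete, the following is a minimal sketch of destination prediction and of the interval comparison described above. The abstract does not name the heuristics, so the two predictors here (`last_target`, `most_frequent`), the helper names, the sample trace, and the timing values are all illustrative assumptions and not the authors' method.

```python
from collections import Counter

def last_target(history):
    """Assumed heuristic: predict that the next send goes to the previous target."""
    return history[-1]

def most_frequent(history):
    """Assumed heuristic: predict the target seen most often so far."""
    return Counter(history).most_common(1)[0][0]

def hit_ratio(destinations, predict):
    """Fraction of sends whose target matched the prediction made beforehand."""
    history, hits = [], 0
    for dest in destinations:
        # A prediction is only possible once at least one send has been seen.
        if history and predict(history) == dest:
            hits += 1
        history.append(dest)
    return hits / len(destinations) if destinations else 0.0

def fraction_hidden(compute_intervals, reconfig_time):
    """Fraction of inter-communication intervals long enough to cover
    a reconfiguration that runs concurrently with the computation."""
    if not compute_intervals:
        return 0.0
    covered = sum(1 for t in compute_intervals if t >= reconfig_time)
    return covered / len(compute_intervals)

# Hypothetical per-process trace of send destinations (node ids).
trace = [1, 1, 2, 1, 1, 2, 1, 1, 2]
print(hit_ratio(trace, last_target))
print(hit_ratio(trace, most_frequent))

# Hypothetical computation intervals and reconfiguration time, same time unit.
print(fraction_hidden([10, 50, 5, 80], reconfig_time=20))
```

A correct prediction only pays off when the interval before the next send is at least as long as the reconfiguration time, which is why the sketch reports the two quantities separately.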
