Message proxies for efficient, protected communication on SMP clusters

This research addresses the problem of providing efficient, protected communication in an SMP cluster without incurring the overhead of system calls or the cost of custom hardware. It analyzes an approach that uses an idle SMP processor to run a message proxy, a communication process that provides protected access to the network. We implement message-proxy-based communication between a pair of IBM Model G30 SMPs and analyze the resulting overheads. We derive a performance model showing that cache-miss latency within an SMP significantly influences message proxy performance. Simulations of a suite of ten parallel applications demonstrate that message proxies match the performance of custom hardware for three of the ten applications and are 10-30% slower for the other seven. A direct cache-update mechanism that reduces cache misses improves the performance of message proxies on communication-intensive programs by 7-25%. We conclude that message proxies provide a viable alternative to custom hardware for protected communication.
