An analysis of VI Architecture primitives in support of parallel and distributed communication

We present the results of a detailed study of the Virtual Interface (VI) paradigm as a communication foundation for a distributed computing environment. Using Active Messages and the Split-C global memory model, we analyze the inherent costs of using VI primitives to implement these high-level communication abstractions. We demonstrate a minimum mapping cost (i.e., the host processing required to map one abstraction onto a lower-level abstraction) of 5.4 μs for both Active Messages and Split-C, using four-way 550 MHz Pentium III SMPs and the Myrinet network. We break this cost down by the individual VI primitives used to support flow control, buffer management, and event processing, and identify the completion queue as the source of the highest overhead. Bulk transfer performance plateaus at 44 Mbytes/s for both implementations because of the added fragmentation requirements. Based on this analysis, we present the implications for the VI successor, InfiniBand. Copyright © 2002 John Wiley & Sons, Ltd.
