An adaptive communication system for heterogeneous network computing

In this paper, we present the architecture of an Adaptive Communication System (ACS) that provides applications with programmable communication, control, and management services that can be adopted dynamically to maximize application performance at runtime. ACS supports adaptive and scalable communication services that select the appropriate multicast/broadcast algorithms for a given class of applications. These algorithms take into consideration both the application requirements and the load on the computing and communication systems. We give an overview of the ACS architecture and then describe our approach to implementing the ACS group communication services. We introduce two procedures (the Resource Aware procedure and the Application Aware procedure) to build an appropriate multicast tree that takes into consideration the characteristics and load conditions of the machines as well as the group communication patterns of a given application. We develop analytical techniques and new metrics to characterize and quantify the performance of a multicast tree. We also present preliminary performance results showing that significant performance gains can be achieved by using the ACS multicast algorithms.
