This paper presents solutions to the problem of many-to-many personalized communication, with bounded incoming and outgoing traffic, on a distributed-memory parallel machine. We present a two-stage algorithm that decomposes a many-to-many communication with possibly high variance in message size into two communications with low message-size variance. The algorithm is deterministic and takes time $2t\mu$ (+ lower-order terms) when $t \geq O(p^2 + p\tau/\mu)$. Here $t$ is the maximum outgoing or incoming traffic at any processor, $\tau$ is the startup overhead, and $\mu$ is the inverse of the data transfer rate. Optimality is achieved when the traffic is large, a condition that is usually satisfied in practice on coarse-grained architectures. The algorithm was implemented on the Connection Machine CM-5. The implementation used the low-latency communication primitives (active messages) available on the CM-5, but the algorithm itself is architecture-independent. An alternative single-stage algorithm using distributed random scheduling was also implemented on the CM-5, and the performance of the two algorithms was compared.
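The abstract does not spell out how the two stages are constructed. As a minimal sketch of one way such a decomposition can work, and not necessarily the authors' construction, the Python below assumes that $p$ is the number of processors, that each processor deals its outgoing data units round-robin across all $p$ processors acting as intermediates (stage 1), and that each intermediate then forwards what it holds to the final destinations (stage 2). The function names `two_stage_sizes` and `stage_time`, and the linear cost model of $\tau$ per non-empty message plus $\mu$ per data unit, are assumptions made for illustration.

```python
def two_stage_sizes(traffic):
    """Message sizes for an illustrative two-stage routing (assumed scheme).

    traffic[i][j] is the number of data units processor i sends to processor j.
    Returns (stage1, stage2), where stage1[i][k] is what i sends to
    intermediate k and stage2[k][j] is what k forwards to destination j.
    """
    p = len(traffic)
    stage1 = [[0] * p for _ in range(p)]
    stage2 = [[0] * p for _ in range(p)]
    for i in range(p):
        slot = 0  # position within processor i's concatenated outgoing data
        for j in range(p):
            for _ in range(traffic[i][j]):
                k = slot % p  # deal units round-robin over the intermediates
                stage1[i][k] += 1
                stage2[k][j] += 1
                slot += 1
    return stage1, stage2


def stage_time(sizes, tau, mu):
    """Assumed cost model: tau per non-empty message plus mu per data unit.

    Returns the largest send-side or receive-side cost over all processors.
    """
    p = len(sizes)

    def cost(s):
        return tau + mu * s if s > 0 else 0

    send = [sum(cost(sizes[i][j]) for j in range(p)) for i in range(p)]
    recv = [sum(cost(sizes[i][j]) for i in range(p)) for j in range(p)]
    return max(send + recv)


if __name__ == "__main__":
    # Highly skewed input: processor 0 sends everything to processor 1.
    skewed = [[0, 12, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
    s1, s2 = two_stage_sizes(skewed)
    print("stage 1 sizes:", s1)
    print("stage 2 sizes:", s2)
    print("modelled time:", stage_time(s1, 1, 1) + stage_time(s2, 1, 1))
```

In this sketch the stage-1 messages leaving any one processor differ in size by at most one unit, and each stage-2 message to a given destination differs from that destination's average by less than $p$ units, which is the sense in which both stages have low message-size variance. When every processor's traffic is close to $t$, each stage costs roughly $p\tau + t\mu$ under the assumed model, consistent with the $2t\mu$ leading term quoted above.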