Improved MPI All-to-all Communication on a Giganet SMP Cluster

We present the implementation of an improved, almost optimal algorithm for regular, personalized all-to-all communication on hierarchical multiprocessors such as clusters of SMP nodes. In MPI, this communication primitive is realized by the MPI_Alltoall collective. The algorithm is a natural generalization of a well-known factorization-based algorithm for non-hierarchical systems. A specific contribution of the paper is a completely contention-free scheme for exchanging messages between SMP nodes that does not rely on token passing. We describe a dedicated implementation for a small Giganet SMP cluster with 6 SMP nodes of 4 processors each. We present simple experiments to validate the assumptions underlying the design of the algorithm; the results were used to guide the detailed implementation of a crucial part of the algorithm. Finally, we compare the improved MPI_Alltoall collective to a trivial (but widely used) implementation and show improvements in average completion time that sometimes exceed 10%. While this may not seem like much, we have reason to believe that the improvements will be more substantial on larger systems.
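To illustrate the factorization idea that the hierarchical algorithm generalizes, the sketch below shows the classic 1-factor schedule for a flat (non-hierarchical) all-to-all exchange over an even number of processes: in each of the p-1 rounds, every process exchanges exactly one block with exactly one partner, so no process is the target of more than one message per round. This is only an illustrative sketch, not the paper's hierarchical algorithm; the function name alltoall_factor, the byte-block buffer layout, and the restriction to an even process count are assumptions made for the example.

#include <mpi.h>
#include <string.h>

/* Illustrative sketch of the classic 1-factor all-to-all schedule for an
 * even number of processes p (not the paper's hierarchical algorithm).
 * sendbuf/recvbuf hold p contiguous blocks of blocksize bytes each; block j
 * is destined for / received from rank j, as in MPI_Alltoall. */
static void alltoall_factor(const char *sendbuf, char *recvbuf,
                            int blocksize, MPI_Comm comm)
{
    int rank, p;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &p);              /* sketch assumes p is even */

    /* Own block is copied locally; no self-message is sent. */
    memcpy(recvbuf + (size_t)rank * blocksize,
           sendbuf + (size_t)rank * blocksize, (size_t)blocksize);

    for (int r = 0; r < p - 1; r++) {     /* p-1 rounds, one partner each */
        int partner;
        if (rank == p - 1)
            partner = r;                  /* "center" process pairs with r */
        else if (rank == r)
            partner = p - 1;
        else                              /* round-robin pairing on 0..p-2 */
            partner = ((2 * r - rank) % (p - 1) + (p - 1)) % (p - 1);

        /* Simultaneous exchange with this round's unique partner. */
        MPI_Sendrecv(sendbuf + (size_t)partner * blocksize, blocksize,
                     MPI_BYTE, partner, 0,
                     recvbuf + (size_t)partner * blocksize, blocksize,
                     MPI_BYTE, partner, 0, comm, MPI_STATUS_IGNORE);
    }
}

For comparison, the trivial implementation referred to in the abstract simply posts nonblocking sends and receives to all other processes at once and waits for completion, leaving the ordering of message transfers, and hence possible contention, to the underlying communication system.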
