Path Based Optimization of MPI Collective Communication Operation in Cloud

Collective communication operations on message-passing platforms, especially MPI Broadcast and Gather, have been studied extensively. Most of this work improves the efficiency of collective operations for specific architectures by considering either their communication network or their platform parameters. This work proposes a simple and general approach to optimizing legacy MPI collective communication algorithms for cloud environments. Cloud computing has become a prominent platform for distributed, large-scale applications. Cloud services reduce hardware investment and make application development and deployment faster. Because the cloud relies on hardware virtualization, resource provisioning is flexible and elastic; however, this virtualization also hides network topology information. Scalable parallel programs can be expressed efficiently and effectively with the MPI library, which is powerful and portable, and system performance is affected mostly by its collective communication operations. Because virtualization hides the topology, topology-aware communication algorithms become ineffective. The main focus of this paper is therefore on improving the efficiency of MPI collective communication operations, especially MPI Broadcast and Gather, and on exploring cost optimizations for MPI. To this end, a new approach that improves the broadcast and gather operations based on network performance matrices is developed. The proposed approach is tested in a LAN environment and on an HPC cluster. The experimental results show an improvement over the existing MPICH2 library.
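To make the idea of steering a broadcast by measured network performance rather than by topology more concrete, the following is a minimal sketch. It is not the algorithm proposed in this paper, which the abstract does not specify. It assumes a hypothetical pairwise latency matrix `latency[i][j]` between ranks and uses a simple greedy (Prim-style) rule to choose, for each rank, the already-informed rank with the cheapest link as its parent. The function name `build_broadcast_tree` and the sample values are illustrative only.

```python
def build_broadcast_tree(latency, root=0):
    """Return {rank: parent_rank} for a broadcast tree rooted at `root`.

    `latency` is an assumed n x n matrix of measured pairwise costs.
    Greedy rule: repeatedly attach the uninformed rank that is reachable
    over the cheapest link from any rank that already holds the message.
    """
    n = len(latency)
    parent = {root: None}  # ranks that already hold the message
    while len(parent) < n:
        # Cheapest link from an informed rank to an uninformed rank.
        _, src, dst = min(
            (latency[s][d], s, d)
            for s in parent
            for d in range(n)
            if d not in parent
        )
        parent[dst] = src
    return parent


if __name__ == "__main__":
    # Hypothetical measurements: neighbouring ranks are close, others far.
    sample = [
        [0, 1, 10, 10],
        [1, 0, 1, 10],
        [10, 1, 0, 1],
        [10, 10, 1, 0],
    ]
    print(build_broadcast_tree(sample, root=0))
```

In an MPI program, a tree like this could be used to decide which rank sends to which in a point-to-point implementation of the broadcast; a gather could follow the same tree in the reverse direction. The sketch only minimizes individual link cost and ignores bandwidth and overlapping sends, which a real scheme would likely need to account for.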
