Towards an Efficient Process Placement Policy for MPI Applications in Multicore Environments

This paper presents a method for efficiently placing MPI processes on multicore machines. Since MPI implementations often feature efficient support for both shared-memory and network communication, an adequate placement policy is a crucial step towards improving application performance. As a case study, we show the results obtained for several NAS computing kernels and explain how the placement policy influences overall performance. In particular, we found that a policy that merely increases the intranode communication ratio is not sufficient: cache utilization is also an influential factor. A more sophisticated policy (e.g., one taking the architecture's memory structure into account) is required to observe performance improvements.
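
One concrete way an application can let the runtime act on such a placement policy is the standard MPI process topology mechanism: describe the communication graph to the library and allow it to reorder ranks so that heavily communicating processes land close together (e.g., on the same node). The minimal C sketch below does this with MPI_Graph_create for an illustrative ring pattern; the pattern, the effect of the reorder flag (implementations are free to ignore it), and the printed mapping are assumptions about a generic MPI implementation, not the specific policy evaluated in the paper.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Illustrative ring pattern: rank i exchanges with i-1 and i+1.
       Every process passes the full graph, as MPI_Graph_create requires. */
    int *idx   = malloc(size * sizeof(int));
    int *edges = malloc(2 * size * sizeof(int));
    for (int i = 0; i < size; i++) {
        idx[i] = 2 * (i + 1);                    /* cumulative degree */
        edges[2 * i]     = (i + size - 1) % size;
        edges[2 * i + 1] = (i + 1) % size;
    }

    /* reorder = 1 permits the MPI implementation to remap ranks so that
       graph neighbours are placed near each other on the machine. */
    MPI_Comm ring;
    MPI_Graph_create(MPI_COMM_WORLD, size, idx, edges, 1, &ring);

    int new_rank;
    MPI_Comm_rank(ring, &new_rank);
    printf("world rank %d -> graph rank %d\n", rank, new_rank);

    free(idx);
    free(edges);
    MPI_Comm_free(&ring);
    MPI_Finalize();
    return 0;
}
```

Whether this helps in practice depends on the implementation actually exploiting the reorder hint; as the paper argues, a mapping driven only by the intranode communication ratio may still leave cache-related performance on the table.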
