POM: A Process Optimization Mapping Tool for MPI Programs

Modern supercomputers consist of many computing nodes, each containing several multi-core processors. Inter-node and intra-node communication have different bandwidths and therefore form two distinct communication layers, of which the intra-node layer performs better. The default MPI process mapping does not take this bandwidth difference into account, which lowers the performance achieved on the computing platform. To address this problem, this paper introduces POM, an automatic tool for optimizing the process mapping of MPI programs. POM provides a low-cost method for collecting communication information and optimizes how communication is distributed across the system, so that the communication capability of the platform is better exploited and program performance improves. First, to represent the communication hierarchy of the computing platform, the supercomputer is simplified into two layers: the top layer consists of the computing nodes connected by a high-speed network, and the bottom layer consists of the multi-core processors within a node, which communicate at higher bandwidth. Second, we introduce a method that transforms collective communication into point-to-point communication and adds it to the communication information. Finally, an undirected graph with weighted edges is used to represent the communication relationships among processes, which turns the process mapping problem into a graph partitioning problem. This paper uses the open-source software Chaco to solve the graph partitioning problem. Experiments show that POM can effectively improve the performance of MPI programs.
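
The pipeline described above can be illustrated with a minimal Python sketch. It is not POM's implementation, and every name in it (add_p2p, add_broadcast, greedy_map, inter_node_traffic) is hypothetical. Two assumptions are made for illustration only: a broadcast is expanded along a binomial tree, although real MPI libraries may choose other algorithms, and a simple greedy heuristic stands in for Chaco as the graph partitioner.

```python
from collections import defaultdict


def add_p2p(graph, src, dst, nbytes):
    """Accumulate traffic on the undirected edge {src, dst}."""
    if src != dst:
        graph[frozenset((src, dst))] += nbytes


def add_broadcast(graph, root, nprocs, nbytes):
    """Expand a broadcast into point-to-point edges.

    Assumption: the broadcast follows a binomial tree rooted at `root`;
    this is an illustrative choice, not taken from the paper.
    """
    step = 1
    while step < nprocs:
        for rel in range(step):
            peer = rel + step
            if peer < nprocs:
                add_p2p(graph, (rel + root) % nprocs,
                        (peer + root) % nprocs, nbytes)
        step *= 2


def greedy_map(graph, nprocs, cores_per_node):
    """Group processes into nodes; a simple stand-in for a real partitioner."""
    def weight(a, b):
        return graph.get(frozenset((a, b)), 0)

    unplaced = set(range(nprocs))
    nodes = []
    while unplaced:
        # Seed the node with the unplaced process that has the most traffic.
        seed = max(sorted(unplaced),
                   key=lambda p: sum(weight(p, q) for q in unplaced))
        group = [seed]
        unplaced.remove(seed)
        # Fill the node with the processes that talk most to the group.
        while len(group) < cores_per_node and unplaced:
            best = max(sorted(unplaced),
                       key=lambda p: sum(weight(p, g) for g in group))
            group.append(best)
            unplaced.remove(best)
        nodes.append(sorted(group))
    return nodes


def inter_node_traffic(graph, nodes):
    """Total edge weight that crosses node boundaries (the cut size)."""
    where = {p: i for i, grp in enumerate(nodes) for p in grp}
    return sum(w for edge, w in graph.items()
               if len({where[p] for p in edge}) == 2)
```

For example, with four processes on two dual-core nodes, where ranks 0 and 2 and ranks 1 and 3 exchange 100 bytes each and ranks 0 and 1 exchange 1 byte, the default block mapping [[0, 1], [2, 3]] sends 200 bytes across the network, whereas greedy_map places the heavily communicating pairs together and leaves only 1 byte of inter-node traffic.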