Heuristic-Based Techniques for Mapping Irregular Communication Graphs to Mesh Topologies

Mapping parallel applications onto the network topology is becoming increasingly important on large supercomputers. Topology-aware mapping can reduce the number of hops that messages travel on the network and hence reduce contention, which can lead to improved performance. This paper discusses heuristic techniques for mapping applications with irregular communication graphs to mesh and torus topologies. Parallel codes with irregular communication constitute an important class of applications; unstructured grid applications are a classic example. Since the mapping problem is NP-hard, this paper presents fast heuristic-based algorithms. These heuristics are part of a larger framework for automatic mapping of parallel applications. We evaluate the heuristics in terms of the reduction in average hops per byte. The heuristics discussed here are applicable to most parallel applications, since irregular graphs constitute the most general category of communication patterns. Some heuristics can also be easily extended to other network topologies.
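
As an illustration of the evaluation metric, the following is a minimal sketch of how average hops per byte might be computed for a given mapping. It assumes the metric is the byte-weighted average of the shortest hop distance between communicating tasks, and it is not one of the paper's heuristics. The names `hop_distance`, `hops_per_byte`, `comm`, `spread`, and `compact` are hypothetical and chosen only for this example.

```python
def hop_distance(a, b, dims, wrap=True):
    """Shortest hop count between coordinates a and b.

    wrap=True models a torus (wraparound links); wrap=False models a mesh.
    """
    hops = 0
    for x, y, size in zip(a, b, dims):
        d = abs(x - y)
        hops += min(d, size - d) if wrap else d
    return hops


def hops_per_byte(comm, mapping, dims, wrap=True):
    """Byte-weighted average hop count (assumed definition of the metric).

    comm    -- dict {(src_task, dst_task): bytes sent}
    mapping -- dict {task: processor coordinate tuple}
    dims    -- size of the mesh/torus in each dimension
    """
    total_bytes = sum(comm.values())
    if total_bytes == 0:
        return 0.0
    weighted_hops = sum(
        nbytes * hop_distance(mapping[src], mapping[dst], dims, wrap)
        for (src, dst), nbytes in comm.items()
    )
    return weighted_hops / total_bytes


# Hypothetical irregular communication graph: four tasks, uneven byte counts.
comm = {(0, 1): 100, (0, 2): 50, (1, 3): 200, (2, 3): 50}
dims = (4, 4)

# Two candidate placements on a 4x4 network: corners versus a 2x2 block.
spread = {0: (0, 0), 1: (3, 3), 2: (0, 3), 3: (3, 0)}
compact = {0: (0, 0), 1: (0, 1), 2: (1, 0), 3: (1, 1)}

print(hops_per_byte(comm, spread, dims, wrap=False))
print(hops_per_byte(comm, compact, dims, wrap=False))
```

In this toy case the compact placement keeps every communicating pair one hop apart, so its hops per byte is lower than that of the spread placement. A mapping heuristic would aim to find such placements automatically.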
