Architecture-independent locality-improving transformations of computational graphs embedded in k-dimensions

From the perspective of parallel computing, a large number of data-parallel applications can be represented as computational graphs. The nodes of these graphs represent tasks that can be executed concurrently, while the edges represent the interactions between them. Further, in the computational graphs derived from many applications, the vertices correspond to multi-dimensional coordinates, and interaction is limited to vertices that are physically proximate. In this paper we show that graphs with these properties can be transformed into simple architecture-independent representations that encapsulate their locality. Such a representation allows the computational graph to be mapped quickly onto the underlying architecture at the time of execution. This is necessary in environments where the available computational resources can be determined only at the time of execution, or where they change during execution.
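The abstract does not say which representation the paper constructs. As a minimal illustration of the general idea, the sketch below assumes a Morton (Z-order) linearization of integer vertex coordinates, which is one well-known locality-preserving ordering and not necessarily the authors' method. The function names `morton_key`, `linearize`, and `map_to_processors` are hypothetical. The ordering is computed once without reference to any machine; the mapping step takes the processor count `p` only at execution time.

```python
def morton_key(coords, bits=10):
    """Interleave the bits of non-negative integer coordinates (Z-order key).

    Assumption: coordinates are already quantized to integers < 2**bits.
    """
    k = len(coords)
    key = 0
    for b in range(bits):
        for d, c in enumerate(coords):
            key |= ((c >> b) & 1) << (b * k + d)
    return key


def linearize(points, bits=10):
    """Return vertex indices sorted by Morton key.

    This step depends only on the vertex coordinates, not on the machine.
    """
    return sorted(range(len(points)), key=lambda i: morton_key(points[i], bits))


def map_to_processors(order, p):
    """Cut the 1-D ordering into p contiguous, near-equal chunks.

    Returns owner[v] = processor assigned to vertex v.
    """
    n = len(order)
    owner = [0] * n
    for rank, v in enumerate(order):
        owner[v] = rank * p // n
    return owner


if __name__ == "__main__":
    # A 4x4 grid of vertices; vertex index is y * 4 + x.
    points = [(x, y) for y in range(4) for x in range(4)]
    order = linearize(points)            # computed once
    print(map_to_processors(order, 2))   # remap cheaply for 2 processors
    print(map_to_processors(order, 4))   # ... or for 4
```

Because the ordering is reused, changing the number of processors only requires re-cutting the one-dimensional list, which is linear in the number of vertices.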
