Runtime support for parallelization of data-parallel applications on adaptive and nonuniform computational environments

In this paper, we discuss the runtime support required to parallelize unstructured data-parallel applications on nonuniform and adaptive environments. The approach is general enough to apply to a wide variety of regular as well as irregular applications. We present performance results for the solution of an unstructured mesh problem on a cluster of heterogeneous workstations.
