Run-Time Partitioning of Scientific Continuum Calculations Running

A wide range of scientific continuum calculations typically concentrate computational effort non-uniformly over localized regions of physical space. We present a run-time partitioning strategy, intended for such methods, that distributes work evenly across a team of processors and that can exploit the spatial localization present in the original computation in order to avoid high overhead costs. We tried out our strategy on Anderson''s Methods of Local Corrections, a type of vortex method for computational fluid dynamics. Because computational effort follows particles that congregate and disperse irregularly about the domain, this problem is hard to partition in a way that distributes the work evenly among the processors. We ran experiments on 32 processors of an Intel Personal Scientific Computer-- a message-passing hypercube multiprocessor--and on 4 processors of a Cray X-MP--a shared-memory vector architecture--and achieved good parallel speedups of 22 and 3.6, respectively. The partitioner may be implemented as a virtual machine (VM) and made available to the programmer as a library of run-time utilities. The semantics of the VM are insensitive to the application and to the computer architecture on which the VM are insensitive to the application and to the computer architecture on which the VM is implemented. The VM works with ordinary programming languages, incurs modest overhead costs, and requires no special hardware support. It should apply to diverse applications, including finite difference methods, and to diverse architectures without requiring that the application be reprogrammed extensively for each new architecture.