Deterministic simulation of shared memory on bounded degree networks

The Parallel Random Access Machine (PRAM) is an abstract parallel machine consisting of a synchronous collection of $n$ processors connected to a shared memory of $m$ cells. The essential feature of the PRAM is that the processors can access any $n$-tuple of distinct cells in a single machine cycle. While the PRAM is an attractive and widely used framework for the design and analysis of parallel algorithms, it does not reflect the constraints of realistic multiprocessors. This thesis explores the problem of efficient deterministic simulation of PRAM computations on bounded degree networks of processors, a model of parallel machines closer to what can be built in practice. It is shown that an arbitrary step of a PRAM with $n$ processors and $m \geq n$ cells of shared memory can be simulated in $O(\log(m/n) \log n / \log\log n + \log n \log\log n \, (\log\log(m/n) - \log\log\log n))$ time in the worst case on an $n$-node bounded degree network with a particular expander-based structure. This simulation is more efficient than all previously known deterministic simulations with respect to both time and space. In the case where $m/n$ is polylogarithmic in $n$, the worst-case time to simulate a single PRAM step is at most $O(\log n \log\log n)$, which is within a factor of $O(\log\log n)$ of the diameter of the network. The space requirements of the algorithm are at most $O(m (\log(m/n))^{3})$ overall. The simulation may also be adapted to run on an $n$-processor augmented mesh-of-trees architecture with a running time of $O(\log n \log\log n \, (\log\log(m/n) - \log\log\log n) + \log(m/n))$. Overall, these results suggest that, in principle at least, it is feasible to provide the abstraction of a shared memory on distributed models of parallel computation with only modest degradation in worst-case performance.
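
As a rough check of the polylogarithmic case, assume $m/n = (\log n)^{c}$ for some constant $c > 1$. The constant $c$ is introduced here only for illustration and is not part of the original statement. Substituting into the general time bound for the expander-based network gives, term by term:

$$\log(m/n) = c \log\log n \quad\Longrightarrow\quad \frac{\log(m/n) \, \log n}{\log\log n} = c \log n,$$

$$\log\log(m/n) - \log\log\log n = \log(c \log\log n) - \log\log\log n = \log c = O(1),$$

$$c \log n + \log n \log\log n \cdot O(1) = O(\log n \log\log n).$$

Under this assumption the first term contributes only $O(\log n)$ and the second term dominates. Since any $n$-node bounded degree network has diameter $\Omega(\log n)$, the resulting bound exceeds the diameter by a factor of at most $O(\log\log n)$, which matches the claim above.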