Constructive, Deterministic Implementation of Shared Memory on Meshes

This paper describes a scheme to implement a shared address space of size $m$ on an $n$-node mesh, with $m$ polynomial in $n$, where each mesh node hosts a processor and a memory module. At the core of the simulation is a hierarchical memory organization scheme (HMOS), which uses a cascade of bipartite graphs to govern how the shared variables, each replicated into multiple copies, are distributed among the memory modules. Based on the expansion properties of these graphs, we devise a protocol that accesses any $n$-tuple of shared variables in worst-case time $O(n^{1/2+\eta})$, for any constant $\eta > 0$, using $O(1/\eta^{1.59})$ copies per variable, or in worst-case time $O(n^{1/2} \log n)$, using $O(\log^{1.59} n)$ copies per variable. In both cases the access time is close to the natural $\Omega(\sqrt{n})$ lower bound imposed by the network diameter. A key feature of the scheme is that it can be made fully constructive when $m$ is not too large, which makes it, in that case, the first efficient, constructive, deterministic scheme in the literature for bounded-degree processor networks. For larger memory sizes, the scheme relies solely on a nonconstructive graph of weak expansion. Finally, the scheme can be efficiently ported to other architectures, as long as they exhibit certain structural properties. In the paper we discuss the porting to multidimensional meshes and to the pruned butterfly, an area-universal network which is a variant of the fat-tree.
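To make the trade-off between access time and redundancy concrete, the following minimal Python sketch evaluates the two asymptotic regimes stated above for a given mesh size. It is only an illustration: the hidden constants are assumed to be 1, the logarithms are assumed to be base 2, and the function names are hypothetical, not taken from the paper.

```python
import math


def constant_eta_regime(n, eta):
    """Return (time, copies) for the O(n^(1/2+eta)) regime.

    Assumption: hidden constants are taken to be 1, so these are
    growth-rate illustrations, not actual running times.
    """
    time = n ** (0.5 + eta)
    copies = (1.0 / eta) ** 1.59
    return time, copies


def log_regime(n):
    """Return (time, copies) for the O(n^(1/2) log n) regime.

    Assumption: logarithms are base 2 and hidden constants are 1.
    """
    log_n = math.log2(n)
    time = math.sqrt(n) * log_n
    copies = log_n ** 1.59
    return time, copies


def diameter_bound(n):
    """Order of the lower bound imposed by the mesh diameter."""
    return math.sqrt(n)


if __name__ == "__main__":
    n = 2 ** 20  # a 1024 x 1024 mesh
    for eta in (0.25, 0.1, 0.05):
        t, c = constant_eta_regime(n, eta)
        print(f"eta={eta}: time ~ {t:.0f}, copies ~ {c:.1f}")
    t, c = log_regime(n)
    print(f"log regime: time ~ {t:.0f}, copies ~ {c:.1f}")
    print(f"diameter bound ~ {diameter_bound(n):.0f}")
```

Under these assumptions, shrinking $\eta$ moves the access time toward the diameter bound while the number of copies per variable grows, which is the trade-off the two stated bounds express.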
