On the Effectiveness of D-BSP as a Bridging Model of Parallel Computation

This paper surveys and places into perspective a number of results concerning the D-BSP (Decomposable Bulk Synchronous Parallel) model of computation, a variant of the popular BSP model proposed by Valiant in the early nineties. D-BSP captures part of the proximity structure of the computing platform, modeling it by suitable decompositions into clusters, each characterized by its own bandwidth and latency parameters. Quantitative evidence is provided that, when modeling realistic parallel architectures, D-BSP achieves higher effectiveness and portability than BSP without significantly affecting ease of use. It is also shown that D-BSP avoids some of the shortcomings of BSP that motivated the definition of other variants of the model. Finally, the paper discusses how the aspects of network proximity incorporated in the model allow better management of network congestion and bank contention when supporting a shared-memory abstraction in a distributed-memory environment.
