A work-optimal deterministic algorithm for the asynchronous certified write-all problem

In their SIAM J. on Computing paper [27] from 1992, Martel et al. posed the question of developing a work-optimal deterministic asynchronous algorithm for the fundamental load-balancing and synchronization problem called Certified Write-All. In this problem, introduced in a slightly different form by Kanellakis and Shvartsman in a PODC'89 paper [17], p processors must update n memory cells and then signal the completion of the updates. It is known that solutions to this problem can be used to simulate synchronous parallel programs on asynchronous systems with worst-case guarantees on the overhead of the simulation. Such simulations are interesting because they may increase productivity in parallel computing, since synchronous parallel programs are easier to reason about than asynchronous ones. This paper answers the question of Martel et al. Specifically, we show a deterministic asynchronous algorithm for the Certified Write-All problem. Our algorithm has O(n + p^4 log n) work, which is optimal for a non-trivial number of processors p ≤ (n/log n)^{1/4}. In contrast, all known deterministic algorithms require work superlinear in n when p = n^{1/r}, for any fixed r ≥ 1. Our algorithm generalizes the collision principle used by algorithm T, which was introduced by Buss et al. [7].
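The optimality claim for the stated range of p follows from a short calculation, sketched here. It assumes only the work bound quoted in the abstract and the observation that each of the n cells must be written at least once, so any solution needs at least linear work.

```latex
% Assumption: p \le (n/\log n)^{1/4}, as in the abstract.
\begin{align*}
  p \le \left(\frac{n}{\log n}\right)^{1/4}
    &\;\Longrightarrow\; p^{4} \le \frac{n}{\log n}
     \;\Longrightarrow\; p^{4}\log n \le n, \\
  n + p^{4}\log n &\le 2n = O(n).
\end{align*}
% Every one of the n cells must be updated, so the work is \Omega(n);
% the upper bound O(n) therefore matches the lower bound in this range of p.
```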

[1]  Richard Cole, et al. The APRAM: incorporating asynchrony into the PRAM model, 1989, SPAA '89.

[2]  Alexander A. Shvartsman, et al. Efficient parallel algorithms can be made robust, 1989, PODC '89.

[3]  Richard J. Anderson, et al. Algorithms for the Certified Write-All Problem, 1997, SIAM J. Comput.

[4]  Dariusz R. Kowalski, et al. Towards practical deterministic write-all algorithms, 2001, SPAA '01.

[5]  Paul G. Spirakis, et al. Efficient robust parallel computations, 1990, STOC '90.

[6]  Joseph Naor, et al. Constructions of Permutation Arrays for Certain Scheduling Cost Measures, 1995, Random Struct. Algorithms.

[7]  Alexander A. Shvartsman. Achieving Optimal CRCW PRAM Fault-Tolerance, 1991, Inf. Process. Lett.

[8]  Barton P. Miller, et al. On the Complexity of Event Ordering for Shared-Memory Parallel Program Executions, 1990, ICPP.

[9]  Z. M. Kedem, et al. Combining tentative and definite executions for very fast dependable parallel computing, 1991, STOC '91.

[10]  Charles U. Martel, et al. Asynchronous PRAM Algorithms for List Ranking and Transitive Closure, 1990, ICPP.

[11]  Yonatan Aumann, et al. Highly efficient asynchronous execution of large-grained parallel programs, 1993, Proceedings of 1993 IEEE 34th Annual Foundations of Computer Science.

[12]  Friedhelm Meyer auf der Heide, et al. Efficient PRAM simulation on a distributed memory machine, 1992, STOC '92.

[13]  Maurice Herlihy, et al. Wait-free synchronization, 1991, TOPLAS.

[14]  Richard Cole, et al. The expected advantage of asynchrony, 1990, SPAA '90.

[15]  Jan Friso Groote, et al. An algorithm for the asynchronous Write-All problem based on process collision, 2001, Distributed Computing.

[16]  David Thomas, et al. The Art in Computer Programming, 2001.

[17]  Charles U. Martel, et al. Work-Optimal Asynchronous Algorithms for Shared Memory Parallel Computers, 1992, SIAM J. Comput.

[18]  Yonatan Aumann, et al. Clock Construction in Fully Asynchronous Parallel Systems and PRAM Simulation, 1994, Theor. Comput. Sci.

[19]  Piotr Indyk, et al. PRAM Computations Resilient to Memory Faults, 1994, ESA.

[20]  Donald E. Knuth, et al. The art of computer programming, volume 3: (2nd ed.) sorting and searching, 1998.

[21]  Prabhakar Ragde, et al. Parallel Algorithms with Processor Failures and Delays, 1996, J. Algorithms.

[22]  Ramesh Subramonian, et al. Designing synchronous algorithms for asynchronous processors, 1992, SPAA '92.

[24]  Charles U. Martel, et al. On the Complexity of Certified Write-All Algorithms, 1994, J. Algorithms.

[25]  Krishna V. Palem, et al. Efficient program transformations for resilient parallel computation via randomization (preliminary version), 1992, STOC '92.

[26]  Alexander Russell, et al. Distributed scheduling for disconnected cooperation, 2005, Distributed Computing.

[27]  L. Lovász. Combinatorial problems and exercises, 1979.

[28]  Michael A. Bender, et al. Efficient execution of nondeterministic parallel programs on asynchronous systems, 1996, SPAA '96.

[29]  Partha Dasgupta, et al. Parallel processing on networks of workstations: a fault-tolerant, high performance approach, 1995, Proceedings of 15th International Conference on Distributed Computing Systems.

[30]  Partha Dasgupta, et al. CALYPSO: a novel software system for fault-tolerant parallel processing on distributed platforms, 1995, Proceedings of the Fourth IEEE International Symposium on High Performance Distributed Computing.

[31]  Paul G. Spirakis, et al. Tentative and Definite Distributed Computations: An Optimistic Approach to Network Synchronization, 1992, WDAG.

[32]  Naomi Nishimura, et al. Asynchronous shared memory parallel computation, 1990, SPAA '90.

[33]  Larry Rudolph, et al. A Complexity Theory of Efficient Parallel Algorithms, 1990, Theor. Comput. Sci.

[34]  Alexander A. Shvartsman, et al. Fault-Tolerant Parallel Computation, 1997.

[35]  Phillip B. Gibbons. A more practical PRAM model, 1989, SPAA '89.