Distributed Cooperation and Adversity: Complexity Trade-offs

Cooperatively performing a collection of tasks in a decentralized setting, where the computing medium is subject to adversarial perturbations, is one of the fundamental problems in distributed computing. Such perturbations can be caused by processor failures, unpredictable delays, and communication breakdowns. The results include (i) failure-sensitive bounds for distributed cooperation problems for synchronous processors subject to crash failures. These research results are motivated by the earlier work of the third author with Paris C. Kanellakis at Brown University.
