PLinda 2.0: a transactional/checkpointing approach to fault tolerant Linda

Robust parallel computation in Linda requires both tuple space and processes to be resilient to failure. In this paper, we present PLinda 2.0, a set of extensions to Linda that supports robust parallel computation on loosely coupled processors communicating over a network. The principal extensions of PLinda 2.0 to Linda are transaction mechanisms for reliable tuple space and process-private logging mechanisms for resilient processes. The transaction mechanisms support two kinds of tuple space: stable tuple space, which is always guaranteed to reflect the state as of the last committed transaction, and unstable tuple space, which is protected by a transaction-consistent checkpoint. The process-private logging mechanisms are provided as tools for a process checkpointing scheme. They allow checkpointing and recovery operations to be customized in each process to achieve low runtime overhead.
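As a rough illustration of the stable tuple space guarantee described above, the following minimal Python sketch models a tuple space whose visible state changes only at commit. It is not PLinda 2.0's API: the names `ToyTupleSpace`, `ToyTransaction`, `out`, `in_`, `commit`, and `abort` are hypothetical, the space lives in memory in a single process, only one transaction is assumed to be active at a time, and `in_` returns `None` instead of blocking as a real Linda `in` would.

```python
def matches(pattern, tup):
    """A pattern matches a tuple of the same length; None acts as a wildcard."""
    return len(pattern) == len(tup) and all(
        p is None or p == t for p, t in zip(pattern, tup)
    )


class ToyTupleSpace:
    """Toy in-memory tuple space (hypothetical, not PLinda's interface)."""

    def __init__(self):
        # State as of the last committed transaction.
        self.committed = []

    def begin(self):
        return ToyTransaction(self)


class ToyTransaction:
    """Buffers tuple space updates until commit (assumes one active transaction)."""

    def __init__(self, space):
        self.space = space
        self.added = []       # tuples written but not yet visible
        self.removed = set()  # indices of committed tuples tentatively withdrawn

    def out(self, tup):
        self.added.append(tuple(tup))

    def in_(self, pattern):
        # Tentatively withdraw the first matching committed tuple.
        for i, tup in enumerate(self.space.committed):
            if i not in self.removed and matches(pattern, tup):
                self.removed.add(i)
                return tup
        return None  # a real Linda `in` would block here

    def commit(self):
        # Apply all buffered updates in one step.
        kept = [t for i, t in enumerate(self.space.committed)
                if i not in self.removed]
        self.space.committed = kept + self.added

    def abort(self):
        # Discard buffered updates; the committed state is untouched.
        self.added.clear()
        self.removed.clear()


if __name__ == "__main__":
    space = ToyTupleSpace()
    setup = space.begin()
    setup.out(("task", 7))
    setup.commit()

    # A worker takes the task, then fails before committing.
    failed = space.begin()
    failed.in_(("task", None))
    failed.abort()
    print(space.committed)  # the task is still available

    # A second worker retries and commits its result.
    retry = space.begin()
    task = retry.in_(("task", None))
    retry.out(("result", task[1] ** 2))
    retry.commit()
    print(space.committed)
```

In this sketch, an aborted transaction leaves no trace in the committed state, so a task withdrawn by a failed worker remains available to another worker. Durability, concurrency control, and the checkpoint-protected unstable tuple space are deliberately out of scope.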
