Paper Information - Programming Partition-Aware Network Applications

Programming Partition-Aware Network Applications

We consider the problem of developing reliable applications to be deployed in partitionable asynchronous distributed systems. What makes this task difficult is guaranteeing the consistency of shared state despite asynchrony, failures and recoveries, including the formation and merging of partitions. While view synchrony within process groups is a powerful paradigm that can significantly simplify reasoning about asynchrony and failures, it is insufficient for coping with recoveries and merging of partitions after repairs. We first give an abstract characterization for shared state management in partitionable asynchronous distributed systems and then show how views can be enriched to convey structural and historical information relevant to the group's activity. The resulting paradigm, called enriched view synchrony, can be implemented efficiently and leads to a simple programming methodology for solving shared state management in the presence of partitions.

Gianluca Dini | Alberto Bartoli | Özalp Babaoglu

[1] Amr El Abbadi, et al. Maintaining availability in partitioned replicated databases, 1987, ACM Trans. Database Syst.

[2] Gianluca Dini, et al. Replicated File Management in Large-Scale Distributed Systems, 1994, WDAG.

[3] Sam Toueg, et al. Unreliable failure detectors for asynchronous systems (preliminary version), 1991, PODC '91.

[4] André Schiper, et al. Uniform reliable multicast in a virtually synchronous environment, 1993, Proceedings. The 13th International Conference on Distributed Computing Systems.

[5] Kenneth P. Birman, et al. Understanding partitions and the 'no partition' assumption, 1993, 1993 4th Workshop on Future Trends of Distributed Computing Systems.

[6] Rachid Guerraoui, et al. Software-Based Replication for Fault Tolerance, 1997, Computer.

[7] Idit Keidar, et al. Increasing the resilience of atomic commit, at no additional cost, 1995, PODS '95.

[8] Flaviu Cristian, et al. An efficient, fault-tolerant protocol for replicated data management, 1985, PODS '85.

[9] André Schiper, et al. Virtually-synchronous communication based on a weak failure suspector, 1993, FTCS-23 The Twenty-Third International Symposium on Fault-Tolerant Computing.

[10] Louise E. Moser, et al. The Totem single-ring ordering and membership protocol, 1995, TOCS.

[11] Fred B. Schneider, et al. Implementing fault-tolerant services using the state machine approach: a tutorial, 1990, CSUR.