Analysis of Replicated Data with Repair Dependency

Pessimistic control algorithms for replicated data permit only one partition to perform update operations at any time, so as to ensure mutually exclusive access to the replicated data object. Existing availability models and analyses of pessimistic control algorithms for replicated data management are restricted to either site-failure-only or link-failure-only models, but not both, because of the large state space that must be considered. Moreover, an independent repairman for each link and each site has been assumed in order to reduce the complexity of the analysis. In this paper, we remove these restrictions with the help of stochastic Petri nets. In addition to including both site and link failure/repair events in our analysis, we investigate the effect of repair dependency, which occurs when many sites and links have to share the same repairman due to repair constraints. Four repairman models are examined in the paper: (a) independent repairman, with one repairman assigned to each link and each node; (b) dependent repairman with FIFO servicing discipline; (c) dependent repairman with linear-order servicing discipline; and (d) dependent repairman with best-first servicing discipline. Using dynamic voting as a case study, we compare and contrast the availabilities obtained under these four repairman models and give a physical interpretation of the differences. We show that ignoring concurrent site and link failure/repair events, or ignoring repair dependency, can lead to a substantial and unrealistic overestimate of the availability of replicated data.
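To give a feel for why repair dependency lowers availability, the following is a minimal sketch under strong simplifying assumptions; it is not the stochastic Petri net model of the paper. It assumes n identical sites with exponential failure rate `lam` and repair rate `mu`, perfect links, and static majority voting in place of dynamic voting. All function and parameter names are hypothetical. Because the sites are identical, the sketch cannot distinguish the FIFO, linear-order, and best-first disciplines; it only contrasts one repairman per site with a single shared repairman.

```python
from math import comb


def failed_count_distribution(n, lam, mu, repairmen):
    """Steady-state distribution of the number of failed sites.

    Birth-death chain: in state k (k sites failed) the failure rate is
    (n - k) * lam and the repair rate is min(k, repairmen) * mu.
    """
    weights = [1.0]
    for k in range(1, n + 1):
        birth = (n - (k - 1)) * lam      # rate from state k-1 to state k
        death = min(k, repairmen) * mu   # rate from state k to state k-1
        weights.append(weights[-1] * birth / death)
    total = sum(weights)
    return [w / total for w in weights]


def majority_availability(n, lam, mu, repairmen):
    """Probability that a strict majority of the n sites is up."""
    dist = failed_count_distribution(n, lam, mu, repairmen)
    return sum(p for k, p in enumerate(dist) if 2 * (n - k) > n)


def independent_closed_form(n, lam, mu):
    """Closed-form check for the independent case (binomial in the up sites)."""
    p_down = lam / (lam + mu)
    return sum(
        comb(n, k) * p_down**k * (1 - p_down) ** (n - k)
        for k in range(n + 1)
        if 2 * (n - k) > n
    )


if __name__ == "__main__":
    n, lam, mu = 5, 0.1, 1.0  # illustrative values, not taken from the paper
    print("independent repairmen:", majority_availability(n, lam, mu, repairmen=n))
    print("one shared repairman: ", majority_availability(n, lam, mu, repairmen=1))
```

With a single shared repairman, failed sites queue for repair, so more probability mass sits on states with many failed sites and the majority availability drops below the independent-repairman value.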
