A performance study of three high availability data replication strategies

Several data replication strategies have been proposed to provide high data availability for database applications. However, the trade-offs among the different strategies for various workloads and different operating modes have not been studied before. In this paper, we study the relative performance of three high-availability data replication strategies (chained declustering, mirrored disks, and interleaved declustering) in a shared-nothing database machine environment. In particular, we examine (1) the relative performance of the three strategies when no failures have occurred, (2) the effect of the load imbalance caused by a disk or processor failure on system throughput and response time, and (3) the trade-off between the benefit of intra-query parallelism and the overhead of activating and scheduling extra operator processes. Experimental results obtained from a simulation study indicate that, in the normal mode of operation, chained declustering and interleaved declustering perform comparably. Both perform better than mirrored disks if an application is I/O bound, but slightly worse than mirrored disks if the application is CPU bound. In the event of a disk failure, chained declustering is able to balance the workload among all remaining operational disks, while the other two strategies cannot. As a result, it provides noticeably better performance than interleaved declustering and much better performance than mirrored disks.
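
As a rough illustration of the load-balancing claim, the following sketch computes the load on each surviving disk after a single disk failure under the three strategies. It is a back-of-envelope model, not the simulation used in the study. It assumes one unit of read load per fragment, mirrored pairs of the form (2k, 2k+1), an interleaved-declustering cluster size chosen by the caller, and a chained layout in which disk i holds the primary copy of fragment i and the backup copy of fragment i-1. All function names are hypothetical.

```python
from fractions import Fraction


def mirrored_load(n_disks, failed):
    """Mirrored disks (assumed pairs 2k, 2k+1): the partner absorbs everything."""
    load = {d: Fraction(1) for d in range(n_disks) if d != failed}
    load[failed ^ 1] += 1  # partner of the failed disk now serves two units
    return load


def interleaved_load(n_disks, cluster_size, failed):
    """Interleaved declustering: the failed disk's load is spread evenly
    over the other disks of its cluster only."""
    load = {d: Fraction(1) for d in range(n_disks) if d != failed}
    start = (failed // cluster_size) * cluster_size
    survivors = [d for d in range(start, start + cluster_size) if d != failed]
    for d in survivors:
        load[d] += Fraction(1, len(survivors))
    return load


def chained_load(n_disks, failed):
    """Chained declustering: work is shifted along the chain so every
    survivor ends up with n/(n-1) units."""
    load = {}
    for j in range(1, n_disks):
        disk = (failed + j) % n_disks
        # Share of the backup fragment (fragment disk-1) served by this disk.
        backup_share = Fraction(n_disks - j, n_disks - 1)
        # Share of its own primary fragment that this disk keeps serving.
        primary_share = Fraction(j, n_disks - 1)
        load[disk] = backup_share + primary_share
    return load


if __name__ == "__main__":
    n, failed = 8, 3
    for name, load in [
        ("mirrored", mirrored_load(n, failed)),
        ("interleaved (cluster of 4)", interleaved_load(n, 4, failed)),
        ("chained", chained_load(n, failed)),
    ]:
        print(name, "max load per disk:", max(load.values()))
```

Under these assumptions, the most heavily loaded surviving disk carries twice its normal load with mirrored disks, C/(C-1) times its normal load with interleaved declustering in clusters of C disks, and n/(n-1) times its normal load with chained declustering over n disks, which matches the ordering reported above.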
