Paper information - The cost of data replication

The cost of data replication

With the advent of data communication networks, researchers have been looking at the possibility of placing copies of a database at two or more nodes of a network. Such data replication is interesting because it makes the database accessible even when some of the nodes in the system fail. Furthermore, transactions which only read data may get faster access to the data when multiple copies exist. Due to the complex and time-consuming update protocols, and due to the additional required hardware, data replication has a definite cost. This cost factor is often overlooked in discussions on replicated data, so in this paper we examine some of the cost issues involved. We argue that the cost of replicating data is so high that only in very special cases will data be replicated at different nodes of a computer communication network and kept in a consistent state. We also discuss some alternative approaches to data replication (such as data replication at a single node and shadow copies).

Hector Garcia-Molina | Daniel Barbará

[1] Y. Matsushita, et al. A hierarchical structure for concurrency control in a distributed database system, 1979, SIGCOMM '79.

[2] George Gardarin, et al. A reliable distributed control algorithm for updating replicated databases, 1979, SIGCOMM '79.

[3] Irving L. Traiger, et al. The notions of consistency and predicate locks in a database system, 1976, CACM.

[4] Robert H. Thomas. A majority consensus approach to concurrency control for multiple copy databases, 1979.

[5] Philip A. Bernstein, et al. Fundamental Algorithms for Concurrency Control in Distributed Database Systems, 1980.

[6] Butler W. Lampson, et al. Crash Recovery in a Distributed Data Storage System, 1981.

[7] Clarence A. Ellis, et al. Consistency and correctness of duplicate database systems, 1977, SOSP '77.

[8] Leslie Lamport, et al. Time, clocks, and the ordering of events in a distributed system, 1978, CACM.