A Class of Randomized Strategies for Low-Cost Comparison of File Copies

A class of algorithms that use randomized signatures to compare remotely located file copies is presented. A simple technique that sends on the order of 4^f·log(n) bits, where f is the number of differing pages to be diagnosed and n is the number of pages in the file, is described. Also presented are a method that improves the bound on the number of bits sent, making it grow with f as f·log(f) and with n as log(n)·log(log(n)), and a class of algorithms in which the number of signatures grows with f as f·r^f, where r can be made to approach 1. A comparison of these techniques is discussed.
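As a concrete illustration of the basic signature idea (a minimal sketch, not the specific randomized strategies analyzed in the paper), the Python example below hashes each page of a file with a shared random seed and compares one short signature per page; the file names, page size, and signature length are assumptions for the example. This naive per-page scheme transmits a signature for every page, i.e. O(n) signatures, whereas the paper's techniques combine randomized signatures so that the number of transmitted bits depends mainly on f and log(n) rather than on n.

```python
import hashlib
import os

PAGE_SIZE = 4096          # bytes per page (assumption for the example)
SIG_BITS = 32             # bits kept per page signature (assumption)

def page_signatures(path, seed):
    """Return one short randomized signature per page of the file.

    The seed is a random value agreed on by both sites; using a fresh
    seed on each run makes it unlikely that two differing pages keep
    colliding to the same signature.
    """
    sigs = []
    with open(path, "rb") as fh:
        while True:
            page = fh.read(PAGE_SIZE)
            if not page:
                break
            digest = hashlib.blake2b(page, key=seed, digest_size=8).digest()
            sigs.append(int.from_bytes(digest, "big") >> (64 - SIG_BITS))
    return sigs

def differing_pages(local_sigs, remote_sigs):
    """Indices of pages whose signatures disagree (candidate differing pages)."""
    n = max(len(local_sigs), len(remote_sigs))
    return [i for i in range(n)
            if i >= len(local_sigs) or i >= len(remote_sigs)
            or local_sigs[i] != remote_sigs[i]]

if __name__ == "__main__":
    seed = os.urandom(16)                      # shared random seed
    a = page_signatures("copy_a.dat", seed)    # computed at site A
    b = page_signatures("copy_b.dat", seed)    # signatures sent from site B
    print("pages that appear to differ:", differing_pages(a, b))
```

Because the signatures are short, two differing pages can collide with probability about 2^-SIG_BITS per page; repeating the comparison with independent seeds drives the chance of a missed difference down, which is the kind of randomization the paper's strategies exploit to keep the communication cost low.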
