Efficient Synchronization of Replicated Data in Distributed Systems

We present nsync, a tool for synchronizing large replicated data sets in distributed systems. nsync computes nearly optimal synchronization plans using a hierarchy of gossip algorithms that take the network topology into account. Our primary design goals were maximum performance and scalability. We achieved these goals by exploiting parallelism in both the planning and synchronization phases, by omitting the transfer of unnecessary metadata, by synchronizing at the block level rather than the file level, and by using sophisticated compression methods. With its relaxed consistency semantics, nsync needs neither a master copy nor a quorum to update distributed replicas. Each replica is kept as an autonomous entity and can be modified with the usual tools.
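
To illustrate why synchronizing at the block level can transfer less data than synchronizing whole files, here is a minimal Python sketch. It is not nsync's actual algorithm: the fixed block size, the use of SHA-256 digests, and the function names `block_digests`, `changed_blocks`, and `apply_blocks` are all assumptions made for this example only.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4  # unrealistically small, chosen only to keep the example readable


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(source: bytes, target_digests: List[bytes],
                   block_size: int = BLOCK_SIZE) -> Dict[int, bytes]:
    """Return the source blocks whose digests differ from the target's.

    Only these blocks would need to cross the network; the digests
    stand in for the metadata the two replicas exchange.
    """
    changed = {}
    for idx, digest in enumerate(block_digests(source, block_size)):
        if idx >= len(target_digests) or target_digests[idx] != digest:
            changed[idx] = source[idx * block_size:(idx + 1) * block_size]
    return changed


def apply_blocks(target: bytes, changed: Dict[int, bytes], source_len: int,
                 block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the source content from the target plus the changed blocks."""
    n_blocks = (source_len + block_size - 1) // block_size
    blocks = [target[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]
    for idx, block in changed.items():
        blocks[idx] = block
    return b"".join(blocks)[:source_len]


source = b"aaaabbbbccccdd"
target = b"aaaaXXXXcccc"
delta = changed_blocks(source, block_digests(target))
synced = apply_blocks(target, delta, len(source))
```

In this example only two of the four blocks (indices 1 and 3) are transferred, whereas file-level synchronization would resend the entire file. A fixed-offset comparison like this does not detect inserted or shifted data; handling that case requires a rolling-checksum scheme, which this sketch deliberately leaves out.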
