Clustered K-Center: Effective Replica Placement in Peer-to-Peer Systems

Peer-to-Peer (P2P) systems provide decentralization, self-organization, scalability and failure resilience, but suffer from high worst-case latencies. Researchers have proposed various replication algorithms that place multiple copies of objects across the network to improve the performance of P2P computing; however, they have neither presented a clear analysis of their algorithms nor derived worst-case bounds for them. In this paper, we model the replica placement problem arising in real-world P2P networks as a Clustered K-Center problem, which we prove to be NP-complete. We then propose an efficient approximation algorithm for this problem with a provable upper bound. Extensive experiments demonstrate the effectiveness and efficiency of our algorithm. The experimental results show that our approach runs several orders of magnitude faster than computing the optimal solution while still minimizing query latency.
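To make the underlying objective concrete, the following is a minimal sketch of the classical (unclustered) K-center problem: choose k replica sites so that the largest distance from any node to its nearest replica is as small as possible. It uses the well-known greedy farthest-first heuristic, which is a 2-approximation when distances satisfy the triangle inequality. This is not the Clustered K-Center algorithm proposed in the paper, whose details are not given in this abstract. The function names `greedy_k_center` and `max_latency`, the distance-matrix input, and the toy node positions are all assumptions made for illustration.

```python
from typing import List, Sequence


def greedy_k_center(dist: Sequence[Sequence[float]], k: int, first: int = 0) -> List[int]:
    """Farthest-first heuristic for the classical K-center problem.

    dist[u][v] is an assumed symmetric latency between nodes u and v.
    Returns the indices of the k nodes chosen as replica sites.
    """
    n = len(dist)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of nodes")
    centers = [first]
    # nearest[v] = latency from v to its closest chosen replica so far.
    nearest = list(dist[first])
    while len(centers) < k:
        # Pick the node that is currently worst served.
        nxt = max(range(n), key=nearest.__getitem__)
        centers.append(nxt)
        for v in range(n):
            nearest[v] = min(nearest[v], dist[nxt][v])
    return centers


def max_latency(dist: Sequence[Sequence[float]], centers: Sequence[int]) -> float:
    """Worst-case latency from any node to its nearest replica."""
    return max(min(dist[c][v] for c in centers) for v in range(len(dist)))


# Toy example (hypothetical): six nodes on a line, latency = distance.
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in positions] for a in positions]
replicas = greedy_k_center(dist, k=2)
print(replicas, max_latency(dist, replicas))
```

In this toy instance the heuristic places replicas at the two end nodes and reaches a worst-case latency of 2, whereas the optimum (replicas at positions 1 and 11) achieves 1, which is consistent with the factor-2 guarantee of the classical heuristic.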
