A Local Facility Location Algorithm for Large-scale Distributed Systems

In a facility location problem (FLP), we are given a set of facilities and a set of clients, each of which is to be served by one facility. The goal is to decide which subset of facilities to open so that the clients are served at minimal cost. In this paper we investigate the FLP in a setting where the cost depends on data known only to the clients. This setting typifies modern distributed systems such as peer-to-peer file-sharing networks, Grid systems, and wireless sensor networks, all of which must perform network organization, data placement, collective power management, and similar tasks. We propose a local and efficient algorithm that solves the FLP in these settings. The algorithm is highly scalable, entirely decentralized, requires no routing capabilities, and is resilient to failures and to changes in the data during its execution.
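To make the objective concrete, the following is a minimal centralized sketch of the problem statement only; it is not the local, decentralized algorithm proposed in the paper. It assumes the standard uncapacitated cost model (a fixed opening cost per facility plus a per-client service cost), which the abstract does not spell out. The names `flp_cost` and `brute_force_flp` and the toy numbers are hypothetical, chosen for illustration.

```python
from itertools import combinations


def flp_cost(open_facilities, opening_cost, service_cost):
    """Cost of one configuration (assumed model): opening costs of the
    chosen facilities plus, for each client, the cost of its cheapest
    open facility."""
    total = sum(opening_cost[f] for f in open_facilities)
    for client_costs in service_cost:
        total += min(client_costs[f] for f in open_facilities)
    return total


def brute_force_flp(opening_cost, service_cost):
    """Try every non-empty subset of facilities and keep the cheapest.
    Exponential in the number of facilities; for illustration only."""
    facilities = range(len(opening_cost))
    best_cost, best_set = float("inf"), None
    for k in range(1, len(opening_cost) + 1):
        for subset in combinations(facilities, k):
            cost = flp_cost(subset, opening_cost, service_cost)
            if cost < best_cost:
                best_cost, best_set = cost, subset
    return best_set, best_cost


# Hypothetical toy instance: 3 facilities, 4 clients.
opening_cost = [4, 3, 5]
service_cost = [  # service_cost[c][f] = cost of serving client c from facility f
    [1, 6, 7],
    [2, 5, 6],
    [7, 1, 3],
    [8, 2, 1],
]

best_set, best_cost = brute_force_flp(opening_cost, service_cost)
print(best_set, best_cost)  # (0, 1) 13
```

In the setting the abstract describes, the rows of `service_cost` would be held privately by the individual clients rather than gathered in one place, which is what rules out a centralized search like this one.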
