Balanced binary trees for ID management and load balance in distributed hash tables

We present a low-cost, decentralized algorithm for ID management in distributed hash tables (DHTs) managed by a dynamic set of hosts. Each host is assigned an ID in the unit interval [0, 1). At any time, the set of IDs splits the interval into disjoint partitions. Hosts do not possess global knowledge of other IDs in the system. The challenge then is to design an efficient decentralized algorithm that maintains roughly equi-sized partitions, in the face of arrivals, departures and changes in the average number of hosts.

Our ID management algorithm is the first to enjoy all of the following properties: (a) both arrivals and departures of hosts are handled, (b) departure of a host causes at most one existing host to change its ID, (c) the ratio of the largest to the smallest partition is at most 4, with high probability, and (d) the expected cost per arrival/departure is Θ(R + log n) messages, where n denotes the current number of participants, and R denotes the cost of routing one message in the DHT. In fact, our algorithm is independent of the topology of the overlay network used for routing.

Variations of our algorithm diminish the ratio between the largest and the smallest partition to (1 + ε), for any ε > 0, albeit at the cost of re-assigning the IDs of O(1/ε) existing hosts per arrival/departure. Ours is the first algorithm that allows such fine-tuning.

Finally, our ID management algorithm enables (a) estimation of the total number of hosts in the system by making only local measurements, and (b) emulation of a variety of deterministic and randomized families of routing topologies, in a straightforward fashion. Among these families are several networks that require O(log n/log k) routing hops in an n-node network with k links per node.
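To make the balance criterion concrete, the following is a minimal sketch of how the partitions induced by a set of IDs, and the ratio of the largest to the smallest partition, could be computed. It is not the paper's ID management algorithm, and it uses global knowledge of all IDs, which hosts in the DHT do not have. The function names `partition_sizes` and `imbalance_ratio` are hypothetical, and treating [0, 1) as a ring, so that the last partition wraps around to the smallest ID, is an assumption.

```python
from typing import List


def partition_sizes(ids: List[float]) -> List[float]:
    """Return the sizes of the partitions that the IDs induce on [0, 1).

    Assumption: the interval is treated as a ring, so the partition that
    starts at the largest ID wraps around and ends at the smallest ID.
    """
    if not ids:
        raise ValueError("at least one ID is required")
    if any(not (0.0 <= x < 1.0) for x in ids):
        raise ValueError("IDs must lie in [0, 1)")
    ordered = sorted(ids)
    if len(set(ordered)) != len(ordered):
        raise ValueError("IDs must be distinct")
    # Gaps between consecutive IDs.
    sizes = [b - a for a, b in zip(ordered, ordered[1:])]
    # Wrap-around gap from the largest ID back to the smallest one.
    sizes.append(1.0 - ordered[-1] + ordered[0])
    return sizes


def imbalance_ratio(ids: List[float]) -> float:
    """Ratio of the largest partition to the smallest partition."""
    sizes = partition_sizes(ids)
    return max(sizes) / min(sizes)


if __name__ == "__main__":
    # Four evenly spaced IDs: perfectly balanced partitions.
    print(imbalance_ratio([0.0, 0.25, 0.5, 0.75]))
    # Three IDs: one partition is twice as large as the other two.
    print(imbalance_ratio([0.0, 0.5, 0.75]))
```

In these terms, property (c) says the ratio stays at most 4 with high probability, and the variations push it down to (1 + ε).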
