A Lightweight Multidimensional Index for Complex Queries over DHTs

In this paper, we study the problem of indexing multidimensional data in P2P networks based on distributed hash tables (DHTs). We advocate an indexing approach that superimposes a multidimensional index tree on top of a DHT, a paradigm that leaves the underlying DHT intact and can therefore adapt to any DHT substrate. In this context, we identify several index design issues and propose a novel indexing scheme called multidimensional Lightweight Hash Tree (m-LIGHT). First, to preserve data locality, m-LIGHT employs a naming mechanism that maps a tree-based index into the DHT and contributes to high efficiency in both index maintenance and query processing. Second, to address load balancing, m-LIGHT uses a new data-aware splitting strategy that achieves optimal load balance under a fixed index size. We present detailed algorithms for processing complex queries over the m-LIGHT index. We also conduct an extensive performance evaluation of m-LIGHT against several state-of-the-art indexing schemes. The experimental results show that m-LIGHT substantially reduces index maintenance overhead and improves query performance in terms of both bandwidth consumption and response latency.
