A PR-quadtree based multi-dimensional indexing for complex query in a cloud system

State-of-the-art indexing mechanisms for distributed cloud data management systems cannot support complex queries, such as multi-dimensional queries and range queries. To solve this problem, we propose a multi-dimensional indexing mechanism named PR-Chord. PR-Chord is composed of a global index, named PR-Index, and the Chord network. The multi-dimensional space spanned by the value ranges of the data is divided into equal-sized hyper-rectangles. PR-Index is a hierarchical index structure, based on an improved PR quadtree, that indexes these hyper-rectangles. A complex query is transformed into a query over the leaf nodes of PR-Index. We design query, insertion, and deletion algorithms to support complex queries. Because PR-Index does not store the multi-dimensional data itself, its maintenance cost is zero. PR-Chord offers load balancing and simple algorithms. The experimental results demonstrate that PR-Chord achieves good query efficiency.
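
The idea of turning a range query into a set of leaf cells can be illustrated with a minimal sketch. It is not the PR-Chord algorithm itself: it assumes a fully expanded subdivision in which every dimension is halved `depth` times, and the names `leaf_cell`, `leaves_for_range`, `lows`, `highs`, and `depth` are hypothetical, introduced only for this illustration.

```python
from itertools import product


def leaf_cell(point, lows, highs, depth):
    """Return the grid coordinates of the leaf cell that contains `point`.

    Assumption: each dimension is split into 2**depth equal intervals,
    which is what a fully expanded quadtree-style subdivision would give.
    """
    cells_per_dim = 2 ** depth
    coords = []
    for x, lo, hi in zip(point, lows, highs):
        width = (hi - lo) / cells_per_dim
        idx = int((x - lo) / width)
        # Clamp so that points on the upper boundary fall in the last cell.
        coords.append(min(max(idx, 0), cells_per_dim - 1))
    return tuple(coords)


def leaves_for_range(q_low, q_high, lows, highs, depth):
    """Return every leaf cell that overlaps the box [q_low, q_high]."""
    first = leaf_cell(q_low, lows, highs, depth)
    last = leaf_cell(q_high, lows, highs, depth)
    per_dim = (range(a, b + 1) for a, b in zip(first, last))
    return list(product(*per_dim))


if __name__ == "__main__":
    lows, highs = (0, 0), (100, 100)
    print(leaf_cell((30, 80), lows, highs, depth=2))
    print(leaves_for_range((10, 10), (40, 60), lows, highs, depth=2))
```

In this sketch a range query touches only the cells whose coordinates lie between the cells of its two corners, so the index needs to know the cell boundaries but not the data stored inside each cell. How leaf cells are assigned to Chord nodes is not described in the abstract and is left out here.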
