R-HBase: A Multi-dimensional Indexing Framework for Cloud Computing Environment

Organizing and processing large-scale multi-dimensional data has become a challenge. In this paper, we present R-HBase, a multi-dimensional indexing framework for cloud computing environments. The R-HBase framework consists of a storage layer and an index layer: the storage layer supports high throughput, and the index layer answers queries efficiently. We evaluate R-HBase with synthetic data. R-HBase can handle tens of thousands of inserts per second while efficiently processing multi-dimensional queries. The results also show that R-HBase is faster than MD-HBase and that it supports multi-dimensional queries over more than three dimensions.
