HBaseSpatial: A Scalable Spatial Data Storage Based on HBase

In recent years, the volume of spatial data has grown rapidly, and storing it has become increasingly difficult. Traditional DBMSs can handle some large spatial datasets efficiently. However, popular open-source relational database systems are overwhelmed by the high insertion rates, query demands, and terabytes of data that such workloads involve. Key-value stores, on the other hand, can effectively support large-scale operations. To address the storage and querying of big vector spatial data, we propose HBaseSpatial, a scalable spatial data storage based on HBase. First, we analyze the distributed storage model of HBase. Then, we design a distributed storage and index model. Finally, we evaluate the storage model and index algorithm on a cluster, using both large sample sets and typical benchmarks, and compare them with MongoDB and MySQL. The experiments show that our model can effectively improve the query speed for big spatial data and provides a good storage solution.
