The data partition strategy based on hybrid range consistent hash in NoSQL database

With the development of Internet technology and cloud computing, more and more applications face the challenges of big data. NoSQL databases are well suited to managing big data because of their high scalability, high availability, and high fault tolerance. The data partitioning strategy plays an important role in a NoSQL database. Existing data partitioning strategies cause problems such as low scalability, hot spots, and low performance. In this paper we propose a new data partitioning strategy, HRCH (hybrid range consistent hash), which partitions the data in a reasonable way. Finally, we use experiments to verify the effectiveness of HRCH. The results show that HRCH can improve the scalability of the system, avoid the hot spot problem as far as possible, and increase the degree of parallelism, which improves the system's performance for some kinds of processing.
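The abstract does not spell out how HRCH works, so the following is only an illustrative sketch of one way range partitioning and consistent hashing can be combined. It is not the authors' algorithm. It assumes integer keys are first grouped into fixed-width ranges, and each range (rather than each key) is then placed on a consistent-hash ring with virtual nodes. The class name `HybridRangeRing`, the parameters `range_width` and `vnodes`, and the node names are all hypothetical.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a ring position (stable across runs)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HybridRangeRing:
    """Hypothetical sketch: key -> fixed-width range -> consistent-hash ring."""

    def __init__(self, range_width: int = 1000, vnodes: int = 64):
        self.range_width = range_width  # assumed width of one key range
        self.vnodes = vnodes            # virtual nodes per physical node
        self._points = []               # sorted ring positions
        self._owner = {}                # ring position -> node name

    def add_node(self, node: str) -> None:
        """Place `vnodes` virtual points for `node` on the ring."""
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            if point in self._owner:    # skip the (very unlikely) collision
                continue
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node: str) -> None:
        """Drop every virtual point that belongs to `node`."""
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def range_id(self, key: int) -> int:
        """Adjacent keys share a range, so they stay on the same node."""
        return key // self.range_width

    def node_for(self, key: int) -> str:
        """Hash the range id and walk clockwise to the next virtual point."""
        if not self._points:
            raise LookupError("ring has no nodes")
        point = _hash(f"range:{self.range_id(key)}")
        idx = bisect.bisect_right(self._points, point) % len(self._points)
        return self._owner[self._points[idx]]


if __name__ == "__main__":
    ring = HybridRangeRing(range_width=100, vnodes=32)
    for name in ("node-a", "node-b", "node-c"):
        ring.add_node(name)
    # Keys 100..199 fall in the same range and therefore on the same node.
    print(ring.node_for(100) == ring.node_for(199))
```

Under these assumptions, keys within one range stay together, which keeps short range scans on a single node, while hashing the range id spreads ranges across nodes. Adding a node moves only the ranges that now hash to the new node's virtual points, which is the scalability property consistent hashing is usually chosen for.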
