An Adaptive Partition-Based Caching Approach for Efficient Range Queries on Key-Value Data

Range queries are in real demand in big data scenarios, such as analytic and time-traveling queries over web archives. We present AdaSI, an adaptive partition-based caching approach for efficient range queries on key-value data. AdaSI partitions the data into data slices, each a run of consecutive data items. The AdaSI Hotscore Algorithm then maximizes the cache-hit probability under a limited cache space. By measuring the Dutyrate and Hotscore of each data slice, AdaSI partitions hot data more finely to improve partitioning precision and adjustment sensitivity, whereas cold data are partitioned at a coarser granularity to reduce storage overhead and the search cost of queries. Our results show that the AdaSI Hotscore Algorithm achieves a cache hit rate nearly as high as that of record-based cache policies while delivering a significant speedup and space reduction, far outperforming those policies on both counts.
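
The slice-based idea described above can be illustrated with a minimal sketch. It is not the AdaSI Hotscore Algorithm: the abstract does not define Dutyrate or Hotscore, so the sketch substitutes a plain access count per item (`density`) as a hypothetical hotness measure. The names `Slice`, `record_query`, `refine`, and `choose_cached`, the halving rule, and the greedy selection are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Slice:
    lo: int            # inclusive start key
    hi: int            # exclusive end key
    hits: float = 0.0  # hypothetical stand-in for a hotness score

    @property
    def size(self) -> int:
        return self.hi - self.lo

    @property
    def density(self) -> float:
        # Accesses per item; an assumed measure, not the paper's Hotscore.
        return self.hits / self.size


def record_query(slices, lo, hi):
    """Count one access on every slice overlapping the key range [lo, hi)."""
    for s in slices:
        if s.lo < hi and lo < s.hi:
            s.hits += 1


def refine(slices, split_density, min_size=1):
    """Split hot slices in half; leave cold slices coarse."""
    out = []
    for s in slices:
        if s.density >= split_density and s.size >= 2 * min_size:
            mid = (s.lo + s.hi) // 2
            # Assumption: the parent's count is divided evenly, so each
            # half starts with the parent's density.
            out.append(Slice(s.lo, mid, s.hits / 2))
            out.append(Slice(mid, s.hi, s.hits / 2))
        else:
            out.append(s)
    return out


def choose_cached(slices, capacity):
    """Greedily pick the densest accessed slices that fit in `capacity` items."""
    chosen, used = [], 0
    for s in sorted(slices, key=lambda s: s.density, reverse=True):
        if s.hits > 0 and used + s.size <= capacity:
            chosen.append(s)
            used += s.size
    return sorted(chosen, key=lambda s: s.lo)
```

In this sketch a cold region stays as one large slice, which keeps the slice list short, while a frequently queried region is halved on each call to `refine`, so later range queries can separate its hot part from its cold part.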
