Towards the design of efficient hash-based indexing scheme for growing databases on non-volatile memory

Abstract The index is a fundamental component of data-intensive systems, used to accelerate data retrieval. In the design of Non-Volatile Memory (NVM) based indexes, the hash-based structure is one of the most promising candidates because it can take full advantage of the byte-addressable property of NVM to perform query operations with constant time complexity. However, we found that a basic operation, the “rehash operation”, may incur a large number of writes to NVM, which harms the endurance of NVM and causes drastic performance degradation. Additionally, range queries cannot be conducted efficiently on hash-based indexes. In this paper, we first investigate how to design an NVM-friendly hash-based structure that takes both endurance and performance into account. Then, we propose a novel indexing scheme called “Bucket Hash”, which significantly reduces the overhead caused by rehash operations and range query operations. We evaluate the proposed Bucket Hash using YCSB workloads. Compared with existing indexes, Bucket Hash achieves a 40% reduction on average in the number of NVM writes while gaining a 30% improvement in timing performance.
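
To make the rehash cost mentioned in the abstract concrete, the following is a minimal sketch of a generic linear-probing hash table that counts slot writes as a stand-in for NVM writes. It is not the Bucket Hash scheme, whose design the abstract does not describe. The class name `CountingHashTable`, the helper `count_writes`, the initial capacity of 8, the doubling policy, and the 0.75 load factor are all assumptions chosen for illustration.

```python
class CountingHashTable:
    """Generic open-addressing table (linear probing) that counts slot writes.

    Illustrative only: every slot write stands in for one NVM write.
    """

    def __init__(self, capacity=8, max_load=0.75):
        self.capacity = capacity
        self.max_load = max_load
        self.slots = [None] * capacity
        self.size = 0
        self.writes = 0         # all slot writes
        self.rehash_writes = 0  # slot writes caused only by rehashing

    def _place(self, key, value):
        """Write (key, value) into its slot; return True if the key is new."""
        i = hash(key) % self.capacity
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % self.capacity
        is_new = self.slots[i] is None
        self.slots[i] = (key, value)
        self.writes += 1
        return is_new

    def _rehash(self):
        """Double the table and re-insert every live entry."""
        live = [s for s in self.slots if s is not None]
        self.capacity *= 2
        self.slots = [None] * self.capacity
        before = self.writes
        for k, v in live:
            self._place(k, v)
        self.rehash_writes += self.writes - before

    def put(self, key, value):
        # Grow before the insert would push the load factor past max_load.
        if self.size + 1 > self.max_load * self.capacity:
            self._rehash()
        if self._place(key, value):
            self.size += 1

    def get(self, key):
        i = hash(key) % self.capacity
        while self.slots[i] is not None:
            if self.slots[i][0] == key:
                return self.slots[i][1]
            i = (i + 1) % self.capacity
        return None


def count_writes(n):
    """Insert n distinct integer keys and return the resulting table."""
    table = CountingHashTable()
    for k in range(n):
        table.put(k, k * 10)
    return table


if __name__ == "__main__":
    t = count_writes(1000)
    print("inserts:", t.size)
    print("total slot writes:", t.writes)
    print("writes due to rehash:", t.rehash_writes)
```

In this sketch every rehash rewrites all live entries, so a growing table pays extra writes beyond the inserts themselves. This is the kind of overhead the abstract identifies as harmful to NVM endurance and performance.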
