Self-Adaptive Linear Hashing for Solid State Drives

Flash-memory-based solid state drives (SSDs) have emerged as an alternative to magnetic disks because of their high performance and low power consumption. However, random writes on SSDs are much slower than reads. Traditional index structures, which assume the symmetric I/O behavior of magnetic disks, therefore cannot fully exploit the performance of SSDs. In this paper, we propose an SSD-optimized linear hashing index called Self-Adaptive Linear Hashing (SAL-Hashing), which reduces the small random writes that index operations issue to SSDs. Our work makes four contributions. First, we organize buckets into groups and sets to enable coarse-grained writes and lazy splits, which avoids intermediate writes to the hash structure. A group consists of a fixed number of buckets, and a set consists of a number of groups. Second, we attach a log region to each set and amortize the cost of reads and writes by committing updates to the log region in batches. Third, to reduce search cost, each log region is equipped with Bloom filters that index its update logs, and we devise a cost-based online algorithm that adaptively merges the log region with the corresponding set when the set becomes search-intensive. Finally, to exploit the internal package-level parallelism of SSDs, we use coarse-grained writes for merge and split operations to achieve high bandwidth. Our experimental results suggest that SAL-Hashing adapts to changes in access patterns and outperforms several competitors under various workloads on two commodity SSDs.
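The interplay between a set, its log region, and the Bloom filter can be illustrated with a minimal in-memory sketch. This is not the authors' implementation: the class names (BloomFilter, LoggedSet), the parameters, and in particular the merge rule are assumptions made for illustration. The abstract does not specify the cost-based online algorithm, so a simple probe counter (merge_threshold) stands in for it here, and Python lists and dicts stand in for SSD pages.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter; sizes are illustrative, not tuned."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))


class LoggedSet:
    """Hypothetical model of one set of buckets plus its log region."""

    def __init__(self, num_buckets=8, merge_threshold=4):
        self.buckets = [dict() for _ in range(num_buckets)]
        self.log = []                    # append-only update log
        self.bloom = BloomFilter()       # indexes keys present in the log
        self.log_probes = 0              # searches that had to scan the log
        self.merge_threshold = merge_threshold  # assumed stand-in cost rule
        self.merges = 0

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        # Updates go to the log region instead of rewriting a bucket.
        self.log.append((key, value))
        self.bloom.add(key)

    def get(self, key):
        if self.log and self.bloom.might_contain(key):
            self.log_probes += 1
            found, value = False, None
            for k, v in reversed(self.log):  # newest entry wins
                if k == key:
                    found, value = True, v
                    break
            # Searches are paying for the log too often: merge it.
            if self.log_probes >= self.merge_threshold:
                self.merge()
            if found:
                return value
        return self._bucket(key).get(key)

    def merge(self):
        # Apply logged updates in order, as one coarse-grained pass.
        for k, v in self.log:
            self._bucket(k)[k] = v
        self.log = []
        self.bloom = BloomFilter()
        self.log_probes = 0
        self.merges += 1
```

In this sketch, a write-heavy set keeps accumulating log entries cheaply, while a set that is searched repeatedly crosses the threshold and is merged, after which lookups go straight to the buckets. A real cost model would weigh the measured read and write costs of the device rather than a fixed count.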
