B+-tree Index Optimization by Exploiting Internal Parallelism of Flash-based Solid State Drives

Previous research has addressed the potential problems that arise when DBMSs designed for hard disks run on flashSSDs. In this paper, we focus instead on exploiting the potential benefits of flashSSDs. First, we examine the internal parallelism of flashSSDs by running benchmarks on various flashSSDs. Then, we suggest algorithm-design principles for making the best use of this internal parallelism. We present a new I/O request concept, called psync I/O, that can exploit the internal parallelism of flashSSDs within a single process. Based on these ideas, we introduce B+-tree optimization methods that utilize internal parallelism. By integrating these methods, we present a B+-tree variant, PIO B-tree. We confirmed that each optimization method substantially enhances index performance. As a result, PIO B-tree improved the B+-tree's insert performance by a factor of up to 16.3 and its point-search performance by a factor of 1.2. The range search of PIO B-tree was up to 5 times faster than that of the B+-tree. Moreover, PIO B-tree outperformed other flash-aware indexes in various synthetic workloads. We also confirmed that PIO B-tree outperforms the B+-tree on index traces collected inside the PostgreSQL DBMS running the TPC-C benchmark.
