Query processing techniques for solid state drives

Solid state drives perform random reads more than 100x faster than traditional magnetic hard disks, while offering comparable sequential read and write bandwidth. Because of their potential to speed up applications, as well as their reduced power consumption, these new drives are expected to gradually replace hard disks as the primary permanent storage medium in large data centers. However, although they may immediately benefit applications that stress random reads, they may not improve database applications, especially those running long data analysis queries. Database query processing engines have been designed around the speed mismatch between random and sequential I/O on hard disks, and their algorithms currently emphasize sequential accesses for disk-resident data. In this paper, we investigate data structures and algorithms that leverage fast random reads to speed up selection, projection, and join operations in relational query processing. We first demonstrate how a column-based layout within each page reduces the amount of data read during selections and projections. We then introduce FlashJoin, a general pipelined join algorithm that minimizes accesses to base and intermediate relational data. FlashJoin's binary join kernel accesses only the join attributes, producing partial results in the form of a join index. Subsequently, its fetch kernel retrieves the attributes for later nodes in the query plan as they are needed. FlashJoin significantly reduces memory and I/O requirements for each join in the query. We implemented these techniques inside Postgres and experimented with an enterprise SSD. Our techniques improved query runtimes by up to 6x for queries ranging from simple relational scans and joins to full TPC-H queries.
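
The split between a join kernel and a fetch kernel can be illustrated with a small in-memory sketch. This is not the FlashJoin implementation inside Postgres; it is a simplified illustration that assumes each relation is held as a dictionary of column lists, that a row id is just a list position, and that the join is a plain hash equi-join. The names `join_kernel`, `fetch_kernel`, and the sample `customers` and `orders` tables are hypothetical.

```python
from collections import defaultdict


def join_kernel(left_keys, right_keys):
    """Join on the key columns only and return a join index.

    The join index is a list of (left_row_id, right_row_id) pairs.
    No other attributes are read at this stage.
    """
    # Build phase: map each left join-key value to the row ids holding it.
    table = defaultdict(list)
    for left_rid, key in enumerate(left_keys):
        table[key].append(left_rid)

    # Probe phase: emit one row-id pair per matching combination.
    join_index = []
    for right_rid, key in enumerate(right_keys):
        for left_rid in table.get(key, ()):
            join_index.append((left_rid, right_rid))
    return join_index


def fetch_kernel(join_index, left_cols, right_cols, left_attrs, right_attrs):
    """Retrieve only the requested attributes for each joined pair.

    Each lookup by row id stands in for a random read, which is the
    access pattern that is cheap on an SSD.
    """
    for left_rid, right_rid in join_index:
        row = {name: left_cols[name][left_rid] for name in left_attrs}
        row.update({name: right_cols[name][right_rid] for name in right_attrs})
        yield row


# Hypothetical sample data in a column-wise layout.
customers = {"c_id": [10, 20, 30], "name": ["Ann", "Bob", "Cy"]}
orders = {"o_id": [1, 2, 3], "cust": [10, 20, 10], "total": [5.0, 7.5, 2.0]}

index = join_kernel(customers["c_id"], orders["cust"])
rows = list(fetch_kernel(index, customers, orders, ["name"], ["total"]))
print(index)
print(rows)
```

In this sketch the join touches only `c_id` and `cust`; the `name` and `total` columns are read later, and only for rows that survived the join, which mirrors the reduction in memory and I/O that the abstract describes.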
