Optimizing R-tree for flash memory

We propose an unbalanced R-tree index optimized for flash memory. We introduce overflow nodes to defer node-splitting operations on the index. We present a new buffering scheme that caches node updates to the index. We conduct experiments on both real solid-state drives and a flash simulation framework.

The R-tree is widely used in spatial data management and data analysis to speed up spatial data retrieval. However, the original R-tree was designed for magnetic disks and performs poorly on flash memory because of flash-specific characteristics such as asymmetric read/write speeds (fast reads, slow writes) and the erase-before-write constraint. In particular, when an index entry is inserted into or deleted from a leaf node, the original R-tree update mechanism usually has to update several interior nodes as well, which causes many slow writes to flash memory. With flash memory now widely used in location-based applications, e.g., to store moving trajectories in intelligent transportation systems, optimizing the R-tree for flash memory has become a critical issue.

In this paper, we propose a novel spatial index, the Flash-Optimized R-tree, that is designed for flash memory. Specifically, we defer node-splitting operations on the R-tree by introducing overflow nodes, which results in an unbalanced tree structure. This mechanism reduces random writes to flash memory and improves the overall performance of the R-tree. In addition, we present a new buffering scheme that efficiently caches updates to the tree, which further reduces random writes to flash memory. We conduct extensive experiments on real flash-memory storage devices as well as on a flash memory simulation platform to evaluate our proposal, and the results suggest that it is efficient with respect to several metrics.
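The idea of deferring splits with overflow nodes can be illustrated with a minimal sketch. It is not the paper's algorithm: the names `Leaf`, `insert`, and `search`, the constants `NODE_CAPACITY` and `MAX_OVERFLOW_NODES`, and the rule that a split is requested once the overflow chain is full are all assumptions made for illustration. The sketch only shows why an insertion into a full leaf need not touch the parent immediately.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

NODE_CAPACITY = 4       # hypothetical: entries that fit in one node (flash page)
MAX_OVERFLOW_NODES = 2  # hypothetical: overflow nodes allowed before a split


@dataclass
class Leaf:
    entries: List[Rect] = field(default_factory=list)
    overflow: List["Leaf"] = field(default_factory=list)  # chain of overflow nodes


def insert(leaf: Leaf, rect: Rect) -> str:
    """Insert rect into leaf; return the action taken.

    "stored"         -> written into the leaf or an existing overflow node
    "overflow"       -> a new overflow node was attached (no parent update)
    "split_required" -> chain is full; rect is NOT stored and the caller
                        would have to split the leaf and update the parent
    """
    for node in [leaf] + leaf.overflow:
        if len(node.entries) < NODE_CAPACITY:
            node.entries.append(rect)
            return "stored"
    if len(leaf.overflow) < MAX_OVERFLOW_NODES:
        leaf.overflow.append(Leaf(entries=[rect]))
        return "overflow"
    return "split_required"


def intersects(a: Rect, b: Rect) -> bool:
    """True if the two rectangles overlap (touching edges count)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(leaf: Leaf, query: Rect) -> List[Rect]:
    """Range query: the overflow chain must be scanned together with the leaf."""
    return [r for node in [leaf] + leaf.overflow
            for r in node.entries if intersects(r, query)]


# Usage: insert 13 unit squares along the x-axis into one leaf.
leaf = Leaf()
actions = [insert(leaf, (float(i), 0.0, float(i) + 1.0, 1.0)) for i in range(13)]
hits = search(leaf, (0.5, 0.0, 2.5, 1.0))
```

In this sketch the trade-off is visible directly: inserts avoid parent updates until the chain fills up, while every search pays for reading the extra overflow nodes, which suits a medium where reads are cheaper than writes.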
