Failure-Atomic Byte-Addressable R-tree for Persistent Memory

In this article, we propose the Failure-atomic Byte-addressable R-tree (FBR-tree), which leverages the byte-addressability, persistence, and high performance of persistent memory while guaranteeing crash consistency. We carefully control the order of store and cacheline flush instructions so that no single store instruction can leave an FBR-tree inconsistent or unrecoverable. We also develop a non-blocking, lock-free range query algorithm for FBR-tree. Because FBR-tree allows read transactions to detect and ignore transient inconsistent states, multiple read transactions can concurrently access tree nodes without shared locks while write transactions are modifying them. Our performance study shows that FBR-tree reduces the overhead of legacy logging and that the lock-free range query algorithm achieves up to 2.6x higher query processing throughput than the shared lock-based crabbing concurrency protocol.
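To make the idea of ordering stores and cacheline flushes concrete, the following is a minimal simulation sketch. It is not the FBR-tree algorithm itself. It assumes a toy node layout (an append-only entry array plus a counter) and a hypothetical `SimulatedPM` class that models a volatile cache in front of durable memory. The point it illustrates is general: if the new entry is flushed before the counter that publishes it is stored, a crash at any step leaves a state that recovery can read without a log.

```python
class SimulatedPM:
    """Toy model (an assumption for illustration): persistent memory behind a volatile cache."""

    def __init__(self):
        self.cache = {}    # stores land here first and are lost on a crash
        self.durable = {}  # contents that survive a crash

    def store(self, addr, value):
        self.cache[addr] = value

    def flush(self, addr):
        # Models a cacheline flush followed by a fence: durable once this returns.
        if addr in self.cache:
            self.durable[addr] = self.cache[addr]

    def load(self, addr, default=None):
        return self.cache.get(addr, self.durable.get(addr, default))

    def crash(self):
        # Only flushed data remains. Real hardware may also evict lines early;
        # the ordering below tolerates that, since the entry is durable first.
        self.cache = dict(self.durable)


def insert(pm, rect, steps=4):
    """Append rect to the node. Running only the first `steps` of the
    four steps models a crash at that point in the sequence."""
    count = pm.load("count", 0)
    ops = [
        lambda: pm.store(("entry", count), rect),  # 1. write the new entry
        lambda: pm.flush(("entry", count)),        # 2. make the entry durable
        lambda: pm.store("count", count + 1),      # 3. publish with one store
        lambda: pm.flush("count"),                 # 4. make the publication durable
    ]
    for op in ops[:steps]:
        op()


def recover(pm):
    """Crash, then read back every entry the durable counter says is valid."""
    pm.crash()
    count = pm.load("count", 0)
    return [pm.load(("entry", i)) for i in range(count)]


if __name__ == "__main__":
    for steps in range(5):
        pm = SimulatedPM()
        insert(pm, (0, 0, 1, 1))               # completed insert
        insert(pm, (2, 2, 3, 3), steps=steps)  # insert interrupted after `steps`
        print(steps, recover(pm))
```

In this sketch the interrupted insert is either fully visible or not visible at all after recovery, because the counter never becomes durable before the entry it covers. Swapping steps 2 and 3 would break that property on hardware that can evict cache lines on its own.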
