Leaper: A Learned Prefetcher for Cache Invalidation in LSM-tree based Storage Engines

Frequency-based cache replacement policies that work well on page-based database storage engines are no longer sufficient for the emerging LSM-tree (Log-Structured Merge-tree) based storage engines. Due to the append-only and copy-on-write techniques applied to accelerate writes, the state-of-the-art LSM-tree adopts mutable record blocks and issues frequent background operations (i.e., compaction, flush) to reorganize records in possibly every block. As a side effect, such operations invalidate the corresponding cache entries for each involved record, causing sudden drops in the cache hit rate and spikes in access latency. Given the observation that existing methods cannot address this cache invalidation problem, we propose Leaper, a machine learning method that predicts hot records in an LSM-tree storage engine and prefetches them into the cache without being disturbed by background operations. We implement Leaper in a state-of-the-art LSM-tree storage engine, X-Engine, as a light-weight plug-in. Evaluation results show that Leaper eliminates about 70% of cache invalidations and 99% of latency spikes with at most 0.95% overhead, as measured on real-world workloads.

PVLDB Reference Format: Lei Yang, Hong Wu, Tieying Zhang, Xuntao Cheng, Feifei Li, Lei Zou, Yujie Wang, Rongyao Chen, Jianying Wang, and Gui Huang. Leaper: A Learned Prefetcher for Cache Invalidation in LSM-tree based Storage Engines. PVLDB, 13(11): 1976-1989, 2020. DOI: https://doi.org/10.14778/3407790.3407803
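The invalidation problem described in the abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Leaper's implementation: `ToyCache`, `hit_rate`, and `predict_hot` are hypothetical names, the workload is made up, and a simple access-count threshold stands in for the learned model.

```python
from collections import Counter


class ToyCache:
    """Hypothetical record cache; each entry remembers the block holding the record."""

    def __init__(self):
        self.entries = {}  # key -> block id

    def contains(self, key):
        return key in self.entries

    def put(self, key, block_id):
        self.entries[key] = block_id

    def invalidate_blocks(self, block_ids):
        """Drop every entry that points at a block rewritten by a background operation."""
        stale = [k for k, b in self.entries.items() if b in block_ids]
        for k in stale:
            del self.entries[k]
        return stale


def hit_rate(cache, accesses):
    """Fraction of accesses that would be served from the cache."""
    return sum(1 for k in accesses if cache.contains(k)) / len(accesses)


def predict_hot(keys, history, threshold=2):
    """Stand-in predictor (assumption): a key is hot if accessed >= threshold times."""
    counts = Counter(history)
    return [k for k in keys if counts[k] >= threshold]


# Made-up access trace and record-to-block placement.
history = ["a", "a", "a", "b", "b", "c", "d", "d", "d", "e"]
location = {"a": 1, "b": 1, "c": 2, "d": 2, "e": 3}

cache = ToyCache()
for key in sorted(set(history)):
    cache.put(key, location[key])
before = hit_rate(cache, history)

# A compaction merges blocks 1 and 2 into a new block 4,
# so all cache entries pointing at blocks 1 and 2 become invalid.
evicted = cache.invalidate_blocks({1, 2})
after_invalidation = hit_rate(cache, history)

# Prefetch only the evicted records predicted to be hot, at their new location.
for key in predict_hot(evicted, history):
    cache.put(key, 4)
after_prefetch = hit_rate(cache, history)

print(before, after_invalidation, after_prefetch)
```

In this toy trace the hit rate collapses after the compaction and largely recovers once the predicted hot records are reinserted; the cold record `c` is deliberately left out of the cache.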
