LearnedCache: A Locality-Aware Collaborative Data Caching by Learning Model

High-efficiency in-memory caching databases are key to building large-scale Internet services. A well-designed cache reduces both the load on back-end servers and application response latency. The popular in-memory caching systems Memcached and Redis have been successfully deployed by major Internet enterprises such as Amazon, GitHub, and Sina. State-of-the-art solutions mainly optimize performance by providing a suite of static caching policies for various applications; Redis, the popular key-value caching store, for example, offers LRU, LFU, and random replacement strategies. These approaches are based on either access frequency or access timestamp and do not consider data locality. However, data locality has been shown to have a significant impact on big data workloads, which are poorly served by static caching policies. In this paper, we present LearnedCache, a highly efficient in-memory caching algorithm that significantly outperforms the replacement policies of Redis and Memcached across a variety of workloads. LearnedCache accomplishes this by leveraging a locality-aware learning model that builds a hotspot map of hot data from both access frequency and access timestamp. Furthermore, to keep the design lightweight and efficient, LearnedCache adopts three collaborative techniques: a TinyLFU admission filter keeps cold requests out of the cache; model training is performed lazily to reduce the overhead on the main worker; and a learned-index data structure keeps the cached data well balanced. Our method is applicable to distributed Web, file system, database, and content delivery services. Compared with the caching policies of two popular in-memory KV stores (i.e., Redis and Memcached), experimental results show that LearnedCache outperforms LRU and LFU by 8.7% and 12.6% on average (up to 13.4% and 16.5%), respectively.
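To make the two core ideas above concrete, the following is a minimal, illustrative sketch of a cache whose eviction score blends access frequency with access recency, and whose admission test follows the TinyLFU idea of admitting a new key only when it is estimated to be hotter than the coldest resident key. The class name, the linear scoring formula, and the use of exact counters (rather than a compact frequency sketch or a trained model) are our own simplifying assumptions for exposition, not the paper's actual design.

```python
class LocalityAwareCache:
    """Toy cache combining frequency and recency for eviction,
    with a TinyLFU-style admission filter for cold requests."""

    def __init__(self, capacity, weight=0.5):
        self.capacity = capacity
        self.weight = weight   # balance between frequency and recency
        self.store = {}        # key -> value
        self.freq = {}         # exact access counts (stand-in for a sketch/model)
        self.last = {}         # key -> logical timestamp of last access
        self.clock = 0         # logical clock instead of wall time

    def _score(self, key):
        # Higher score = hotter: reward frequency, penalize age.
        age = self.clock - self.last[key]
        return self.weight * self.freq.get(key, 0) - (1 - self.weight) * age

    def get(self, key):
        self.clock += 1
        self.freq[key] = self.freq.get(key, 0) + 1
        if key in self.store:
            self.last[key] = self.clock
            return self.store[key]
        return None

    def put(self, key, value):
        self.clock += 1
        self.freq[key] = self.freq.get(key, 0) + 1
        if key in self.store or len(self.store) < self.capacity:
            self.store[key] = value
            self.last[key] = self.clock
            return True
        # TinyLFU-style admission: admit only if the candidate is
        # estimated to be hotter than the coldest resident key.
        victim = min(self.store, key=self._score)
        if self.freq[key] <= self.freq.get(victim, 0):
            return False           # cold request filtered out of the cache
        del self.store[victim]
        del self.last[victim]
        self.store[key] = value
        self.last[key] = self.clock
        return True
```

With this admission test, a one-off cold key cannot displace a key that has been accessed repeatedly, which is the behavior the TinyLFU filter provides in the full design.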
