Kangaroo: Caching Billions of Tiny Objects on Flash

Many social-media and IoT services have very large working sets consisting of billions of tiny (≈100 B) objects. Large, flash-based caches are important to serving these working sets at acceptable monetary cost. However, caching tiny objects on flash is challenging for two reasons: (i) SSDs can read/write data only in multi-KB "pages" that are much larger than a single object, stressing the limited number of times flash can be written; and (ii) very few bits per cached object can be kept in DRAM without losing flash's cost advantage. Unfortunately, existing flash-cache designs fall short of addressing these challenges: write-optimized designs require too much DRAM, and DRAM-optimized designs require too many flash writes. We present Kangaroo, a new flash-cache design that optimizes both DRAM usage and flash writes to maximize cache performance while minimizing cost. Kangaroo combines a large, set-associative cache with a small, log-structured cache. The set-associative cache requires minimal DRAM, while the log-structured cache minimizes Kangaroo's flash writes. Experiments using traces from Facebook and Twitter show that Kangaroo achieves DRAM usage close to the best prior DRAM-optimized design, flash writes close to the best prior write-optimized design, and miss ratios better than both. Kangaroo's design is Pareto-optimal across a range of allowed write rates, DRAM sizes, and flash sizes, reducing misses by 29% over the state of the art. These results are corroborated with a test deployment of Kangaroo in a production flash cache at Facebook.
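To make the two-tier idea in the abstract concrete, here is a toy sketch of a small log placed in front of a set-associative cache. It is not Kangaroo's implementation. The class name `TwoTierCache`, its parameters, and the flush policy (when the log overflows, move every logged object that maps to the same set as the oldest entry in one go) are assumptions chosen for illustration; the abstract states only that the log-structured cache minimizes flash writes. The model counts one flash page write per set rewrite and ignores the log's own sequential writes and any DRAM index cost.

```python
from collections import OrderedDict


class TwoTierCache:
    """Toy model: a small FIFO log in front of a set-associative cache."""

    def __init__(self, num_sets, set_capacity, log_capacity):
        # Each set stands in for one flash page holding several tiny objects.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.set_capacity = set_capacity
        # The log buffers recent inserts; insertion order doubles as FIFO order.
        self.log = OrderedDict()
        self.log_capacity = log_capacity
        # Hypothetical cost metric: one page write per set rewrite.
        self.set_page_writes = 0

    def _set_index(self, key):
        # A key's hash fixes its set, so no per-object index is needed
        # to locate objects that live in the set-associative tier.
        return hash(key) % len(self.sets)

    def get(self, key):
        # The log holds the freshest copy, so check it first.
        if key in self.log:
            return self.log[key]
        return self.sets[self._set_index(key)].get(key)

    def put(self, key, value):
        self.log[key] = value
        self.log.move_to_end(key)
        if len(self.log) > self.log_capacity:
            self._flush_oldest()

    def _flush_oldest(self):
        # Assumed policy: take the oldest logged key and move every logged
        # object that maps to the same set, paying for one page write.
        oldest_key = next(iter(self.log))
        idx = self._set_index(oldest_key)
        batch = [k for k in self.log if self._set_index(k) == idx]
        target = self.sets[idx]
        for k in batch:
            target[k] = self.log.pop(k)
            target.move_to_end(k)
        # Evict the oldest entries if the set overflows (FIFO within a set).
        while len(target) > self.set_capacity:
            target.popitem(last=False)
        self.set_page_writes += 1


if __name__ == "__main__":
    cache = TwoTierCache(num_sets=4, set_capacity=8, log_capacity=8)
    inserts = 32
    for k in range(inserts):
        cache.put(k, f"v{k}")
    # A set-only design would rewrite one page per insert.
    print("set page writes with log:", cache.set_page_writes)
    print("set page writes without log:", inserts)
```

In this model the saving comes entirely from batching: several objects destined for the same set share a single page rewrite, whereas inserting each object directly into its set would cost one rewrite per object. A larger log would gather more objects per set before flushing, at the price of more DRAM to index the log, which is the trade-off the abstract describes.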
