Pannier: Design and Analysis of a Container-Based Flash Cache for Compound Objects

Classic caching algorithms leverage recency, access count, and/or other properties of cached blocks at per-block granularity. However, for media such as flash, which incur performance and wear penalties for small overwrites, implementing cache policies at a larger granularity is beneficial. Recent research has focused on buffering small blocks and writing them in large units, sometimes called containers, but it has not explored the ramifications of, and best strategies for, caching compound blocks that consist of logically distinct but physically co-located blocks. A container may hold highly diverse blocks: a mixture of frequently accessed, infrequently accessed, and invalidated blocks. We propose and evaluate Pannier, a flash cache layer that provides high performance while extending flash lifespan. Pannier uses three main techniques: (1) leveraging block access counts to manage cache containers, (2) incorporating block liveness as a property to improve flash cache space efficiency, and (3) designing a multi-step feedback controller to ensure that a flash cache reaches its desired lifespan while maintaining performance. Our evaluation shows that Pannier improves flash cache performance and extends lifespan beyond previous per-block and container-aware caching policies. More fundamentally, our investigation highlights the importance of creating new policies for caching compound blocks in flash.
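To make the first two techniques concrete, the following is a minimal illustrative sketch of choosing an eviction victim at container granularity from per-block access counts and liveness. It is not Pannier's actual policy: the `Block` and `Container` types, the `score` formula (mean access count per slot, with invalidated blocks contributing zero), and `pick_victim` are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    access_count: int = 0  # reads observed since the block was cached
    live: bool = True      # False once the block has been invalidated


@dataclass
class Container:
    name: str
    blocks: List[Block] = field(default_factory=list)

    def score(self) -> float:
        """Hypothetical container priority (an assumption, not the paper's metric).

        Mean access count per slot, where invalidated blocks count as zero,
        so both cold containers and mostly-dead containers score low.
        """
        if not self.blocks:
            return 0.0
        useful = sum(b.access_count for b in self.blocks if b.live)
        return useful / len(self.blocks)


def pick_victim(containers: List[Container]) -> Container:
    """Return the container with the lowest score as the eviction candidate."""
    return min(containers, key=lambda c: c.score())


if __name__ == "__main__":
    hot = Container("hot", [Block(5), Block(4), Block(6), Block(5)])
    mixed = Container("mixed", [Block(9), Block(0), Block(0), Block(1)])
    mostly_dead = Container(
        "mostly_dead",
        [Block(8), Block(8, live=False), Block(8, live=False), Block(8, live=False)],
    )
    victim = pick_victim([hot, mixed, mostly_dead])
    print(victim.name, victim.score())
```

In this toy setup the container holding one very hot block among invalidated ones is chosen over a uniformly warm container, which illustrates why a container-level policy has to weigh access counts against liveness rather than rank individual blocks.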
