Erasing Belady's Limitations: In Search of Flash Cache Offline Optimality

NAND-based solid-state (flash) drives are known for providing better performance than magnetic disk drives, but they have limits on endurance, the number of times data can be erased and overwritten. Furthermore, the unit of erasure can be many times larger than the basic unit of I/O; this leads to complexity with respect to consolidating live data and erasing obsolete data. When flash drives are used as a cache for a larger, disk-based storage system, the choice of a cache replacement algorithm can make a significant difference in both performance and endurance. While there are many cache replacement algorithms, their effectiveness is hard to judge due to the lack of a baseline against which to compare them: Belady's MIN, the usual offline best-case algorithm, considers read hit ratio but not endurance. We explore offline algorithms for flash caching in terms of both hit ratio and flash lifespan. We design and implement a multi-stage heuristic by synthesizing several techniques that manage data at the granularity of a flash erasure unit (which we call a container) to approximate the offline optimal algorithm. We find that simple techniques contribute most of the available erasure savings. Our evaluation shows that the container-optimized offline heuristic is able to provide the same optimal read hit ratio as MIN with 67% fewer flash erasures. More fundamentally, our investigation provides a useful approximate baseline for evaluating any online algorithm, highlighting the importance of comparing new policies for caching compound blocks in flash.
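To make the gap between hit ratio and endurance concrete, here is a minimal Python sketch of Belady's MIN that also counts how many blocks are written into the cache. It is an illustration only and not the multi-stage container heuristic described above. The names `simulate_min` and `bypass_dead`, the toy trace, and the rule of skipping blocks that are never read again are all assumptions chosen for the example. The baseline also assumes that every miss is inserted into the cache.

```python
import math


def next_use_indices(trace):
    """For each position i, return the index of the next access to trace[i]
    (math.inf if the block is never accessed again)."""
    nxt = [math.inf] * len(trace)
    last_seen = {}
    for i in range(len(trace) - 1, -1, -1):
        nxt[i] = last_seen.get(trace[i], math.inf)
        last_seen[trace[i]] = i
    return nxt


def simulate_min(trace, cache_size, bypass_dead=False):
    """Replay `trace` with farthest-in-future eviction.

    Returns (hits, insertions). `insertions` counts blocks written into the
    cache, a rough stand-in for flash writes (an assumption of this sketch).
    With bypass_dead=True, a missed block that is never accessed again is
    not inserted at all.
    """
    nxt = next_use_indices(trace)
    cache = {}  # block -> index of its next access
    hits = insertions = 0
    for i, block in enumerate(trace):
        if block in cache:
            hits += 1
            cache[block] = nxt[i]
            continue
        if bypass_dead and nxt[i] == math.inf:
            continue  # hypothetical write-avoidance rule: skip dead blocks
        if len(cache) >= cache_size:
            # Evict the cached block whose next access is farthest away.
            victim = max(cache, key=cache.get)
            del cache[victim]
        cache[block] = nxt[i]
        insertions += 1
    return hits, insertions


toy_trace = ["a", "b", "c", "a", "d", "b", "e", "a"]  # hypothetical trace
print(simulate_min(toy_trace, cache_size=2))
print(simulate_min(toy_trace, cache_size=2, bypass_dead=True))
```

On this toy trace the bypass variant writes fewer blocks into the cache without losing any hits. Turning insertions into erasure counts would additionally require modeling how blocks are packed into containers, which this sketch leaves out.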
