RUE: A caching method for identifying and managing hot data by leveraging resource utilization efficiency

In this study, we propose RUE, a caching method for dynamic large-scale data streams. We define a data model to facilitate hot data identification and management. At the heart of the RUE model is the hot degree, which takes into account two factors, data resource utilization efficiency and reuse distance, and aims to quantitatively reflect data popularity in a dynamic data stream. Based on the hot degree of the data, RUE classifies data into four types, each of which is assigned an associated cache residence time. Guided by the RUE model, we develop the HM algorithm to identify and manage hot data in a dynamic data stream. The HM algorithm is implemented with four stacks, namely, the new stack, short stack, long stack, and temp stack. Moreover, an eviction algorithm and a migration algorithm are integrated into HM to facilitate block replacement and migration. To evaluate the performance of the HM algorithm, we quantitatively compare RUE with three state-of-the-art algorithms, namely, LRU, LIRS, and ARC, under various replacement policies, operations, and workloads. Experimental results show that RUE outperforms these three existing algorithms in terms of both read and write hit rates. Furthermore, we show that with the four stacks in place, the computing overhead of HM is negligible.
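As an illustration of one of the two factors behind the hot degree, the sketch below computes reuse distance over an access trace. It assumes the common definition (the number of distinct other blocks accessed between two consecutive accesses to the same block), which may differ from the definition used in the RUE model. The function name `reuse_distances` and the trace format are hypothetical. The sketch does not compute the hot degree, since its formula is not given here.

```python
from collections import OrderedDict


def reuse_distances(trace):
    """Return one reuse distance per access in `trace`.

    Assumed definition (not taken from the RUE model): the number of
    distinct other blocks accessed since the previous access to the
    same block. A first access has no reuse distance and yields None.
    """
    recency = OrderedDict()  # keys ordered from least to most recently used
    distances = []
    for block in trace:
        if block in recency:
            keys = list(recency)
            # Every key after `block` in recency order was accessed
            # since the last access to `block`.
            distances.append(len(keys) - 1 - keys.index(block))
            recency.move_to_end(block)
        else:
            distances.append(None)
            recency[block] = True
    return distances


if __name__ == "__main__":
    print(reuse_distances(["a", "b", "c", "a", "b", "b"]))
    # prints [None, None, None, 2, 2, 0]
```

A small reuse distance suggests that a block is re-referenced soon and may be worth keeping in the cache. The linear scan per access keeps the sketch short. A tree-based structure would be more suitable for long traces.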
