An Imitation Learning Approach for Cache Replacement

Program execution speed critically depends on increasing cache hits, as cache hits are orders of magnitude faster than misses. To increase cache hits, we focus on the problem of cache replacement: choosing which cache line to evict upon inserting a new line. This is challenging because it requires planning far ahead and currently there is no known practical solution. As a result, current replacement policies typically resort to heuristics designed for specific common access patterns, which fail on more diverse and complex access patterns. In contrast, we propose an imitation learning approach to automatically learn cache access patterns by leveraging Belady's, an oracle policy that computes the optimal eviction decision given the future cache accesses. While directly applying Belady's is infeasible since the future is unknown, we train a policy conditioned only on past accesses that accurately approximates Belady's even on diverse and complex access patterns, and call this approach Parrot. When evaluated on 13 of the most memory-intensive SPEC applications, Parrot increases cache miss rates by 20% over the current state of the art. In addition, on a large-scale web search benchmark, Parrot increases cache hit rates by 61% over a conventional LRU policy. We release a Gym environment to facilitate research in this area, as data is plentiful, and further advancements can have significant real-world impact.

[1]  Laszlo A. Belady,et al.  A Study of Replacement Algorithms for Virtual-Storage Computer , 1966, IBM Syst. J..

[2]  Dean Pomerleau,et al.  ALVINN, an autonomous land vehicle in a neural network , 2015 .

[3]  Janowsky,et al.  Pruning versus clipping in neural networks. , 1989, Physical review. A, General physics.

[4]  Norman P. Jouppi,et al.  Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.

[5]  David R. Cheriton,et al.  Application-controlled physical memory using external page-cache management , 1992, ASPLOS V.

[6]  Huaiyu Zhu On Information and Sufficiency , 1997 .

[7]  Derek Bruening,et al.  An infrastructure for adaptive dynamic optimization , 2003, International Symposium on Code Generation and Optimization, 2003. CGO 2003..

[8]  John L. Henning SPEC CPU2006 benchmark descriptions , 2006, CARN.

[9]  SPEC CPU 2006 Benchmark Descriptions , 2006 .

[10]  Onur Mutlu,et al.  A Case for MLP-Aware Cache Replacement , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).

[11]  Aamer Jaleel,et al.  Adaptive insertion policies for high performance caching , 2007, ISCA '07.

[12]  John Langford,et al.  Search-based structured prediction , 2009, Machine Learning.

[13]  Tao Qin,et al.  A general approximation framework for direct optimization of information retrieval measures , 2010, Information Retrieval.

[14]  Aamer Jaleel,et al.  High performance cache replacement using re-reference interval prediction (RRIP) , 2010, ISCA.

[15]  Yoshua Bengio,et al.  Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.

[16]  Samira Manabi Khan,et al.  Sampling Dead Block Prediction for Last-Level Caches , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.

[17]  Geoffrey J. Gordon,et al.  A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.

[18]  Carole-Jean Wu,et al.  SHiP: Signature-based Hit Predictor for high performance caching , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[19]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[20]  Song Jiang,et al.  Characterizing Facebook's Memcached Workload , 2014, IEEE Internet Computing.

[21]  J. Andrew Bagnell,et al.  Reinforcement and Imitation Learning via Interactive No-Regret Learning , 2014, ArXiv.

[22]  Geoffrey E. Hinton,et al.  Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[23]  Quoc V. Le,et al.  Addressing the Rare Word Problem in Neural Machine Translation , 2014, ACL.

[24]  Christopher D. Manning,et al.  Effective Approaches to Attention-based Neural Machine Translation , 2015, EMNLP.

[25]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[26]  Samy Bengio,et al.  Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks , 2015, NIPS.

[27]  Sachin Katti,et al.  Cliffhanger: Scaling Performance Cliffs in Web Memory Caches , 2016, NSDI.

[28]  Song Han,et al.  Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding , 2015, ICLR.

[29]  Traian Rebedea,et al.  Playing Atari Games with Deep Reinforcement Learning and Human Checkpoint Replay , 2016, ArXiv.

[30]  Akanksha Jain,et al.  Back to the Future: Leveraging Belady's Algorithm for Improved Cache Replacement , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).

[31]  Martin A. Riedmiller,et al.  Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards , 2017, ArXiv.

[32]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[33]  Ali Farhadi,et al.  Bidirectional Attention Flow for Machine Comprehension , 2016, ICLR.

[34]  Vivienne Sze,et al.  Efficient Processing of Deep Neural Networks: A Tutorial and Survey , 2017, Proceedings of the IEEE.

[35]  Byron Boots,et al.  Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction , 2017, ICML.

[36]  Razvan Pascanu,et al.  Learning to Navigate in Complex Environments , 2016, ICLR.

[37]  Tom Schaul,et al.  Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.

[38]  Guillaume Lample,et al.  Playing FPS Games with Deep Reinforcement Learning , 2016, AAAI.

[39]  Gireeja Ranade,et al.  Adaptive Information Gathering via Imitation Learning , 2017, Robotics: Science and Systems.

[40]  Daniel Sánchez,et al.  Maximizing Cache Performance Under Uncertainty , 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).

[41]  Christoforos E. Kozyrakis,et al.  Learning Memory Access Patterns , 2018, ICML.

[42]  Tom Schaul,et al.  Deep Q-learning From Demonstrations , 2017, AAAI.

[43]  Hongzi Mao,et al.  Learning Caching Policies with Subsampling , 2019 .

[44]  Natalia Gimelshein,et al.  PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.

[45]  Calvin Lin,et al.  Applying Deep Learning to the Cache Replacement Problem , 2019, MICRO.

[46]  Mohammad Norouzi,et al.  Optimal Completion Distillation for Sequence Learning , 2018, ICLR.

[47]  Daniel Tarlow,et al.  Learning Execution through Neural Code Fusion , 2019, ICLR.

[48]  A. Jaleel Memory Characterization of Workloads Using Instrumentation-Driven Simulation A Pin-based Memory Characterization of the SPEC CPU 2000 and SPEC CPU 2006 Benchmark Suites , 2022 .