Defending Against Flush+Reload Attack With DRAM Cache by Bypassing Shared SRAM Cache

Cache side-channel attacks are a critical security threat to modern computing systems. As a representative cache side-channel attack, the Flush+Reload attack allows an attacker to steal confidential information (e.g., a private encryption key) by monitoring a victim’s cache access patterns while the victim generates the confidential values. Meanwhile, to provide high performance for memory-intensive applications that do not fit in the on-chip SRAM-based last-level cache (e.g., L3 cache), modern computing systems have started to deploy a DRAM cache between the SRAM-based last-level cache and the main memory DRAM, which can provide low latency and/or high bandwidth. In this work, however, we propose an approach called <monospace>ByCA</monospace> that exploits the DRAM cache for security rather than performance. <monospace>ByCA</monospace> bypasses the shared L3 cache when accessing cache blocks suspected of being an attacker’s target blocks. Consequently, <monospace>ByCA</monospace> eliminates the timing difference when the attacker accesses the target cache blocks, nullifying Flush+Reload attacks. To this end, <monospace>ByCA</monospace> keeps cache blocks suspected of being target blocks in the L4 DRAM cache, together with their states (i.e., flushed by <monospace>clflush</monospace> or not), even when a <monospace>clflush</monospace> instruction is executed; <monospace>ByCA</monospace> redefines and re-implements the <monospace>clflush</monospace> instruction so that it does not flush cache blocks from the L4 DRAM cache while still flushing them from the other cache levels (i.e., L1, L2, and L3 caches). In addition, <monospace>ByCA</monospace> bypasses the L3 cache when the attacker or the victim accesses target blocks flushed by <monospace>clflush</monospace>, so that the attacker always obtains the blocks from the L4 DRAM cache regardless of the victim’s access patterns.
Consequently, <monospace>ByCA</monospace> eliminates the timing difference, so the attacker cannot monitor the victim’s cache access patterns. For the L4 DRAM cache, we implement the Alloy Cache design and use an unused bit in the tag entry of each block to store its state. <monospace>ByCA</monospace> requires only a single-bit extension to cache blocks in the private L1 and L2 caches and to the tag entry of each block in the L4 DRAM cache. Our experimental results show that <monospace>ByCA</monospace> completely eliminates the timing differences when the attacker reloads the target blocks. Furthermore, <monospace>ByCA</monospace> does not degrade the victim’s performance while it co-runs with an attacker that repeatedly flushes and reloads the target blocks.
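
The mechanism described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the paper's simulator or hardware design: the class `Hierarchy`, the function `reload_latency`, the latency constants, and the address `0x1000` are all hypothetical, private L1/L2 caches are collapsed into one set per core, the flushed state is assumed to stay set once marked, and the baseline `clflush` is assumed to evict the block from the L4 DRAM cache as well. The model compares the attacker's reload latency with and without an intervening victim access.

```python
from dataclasses import dataclass, field

# Hypothetical latencies in cycles, chosen only for illustration.
PRIVATE_HIT, L3_HIT, L4_HIT, DRAM = 4, 40, 120, 300


@dataclass
class Hierarchy:
    byca: bool  # True: ByCA-style clflush and L3 bypass; False: baseline
    # Private L1/L2 caches, collapsed into one set of blocks per core.
    private: dict = field(
        default_factory=lambda: {"attacker": set(), "victim": set()}
    )
    l3: set = field(default_factory=set)       # shared SRAM cache
    l4: set = field(default_factory=set)       # DRAM cache
    flushed: set = field(default_factory=set)  # models the state bit in the L4 tag

    def clflush(self, addr):
        # Both variants flush the block from L1, L2, and L3.
        for blocks in self.private.values():
            blocks.discard(addr)
        self.l3.discard(addr)
        if self.byca:
            # Assumption: the block stays (or is written back) in L4 and is
            # marked as flushed instead of being evicted.
            self.l4.add(addr)
            self.flushed.add(addr)
        else:
            self.l4.discard(addr)

    def access(self, core, addr):
        """Return the latency of one load and update the cache contents."""
        if addr in self.private[core]:
            return PRIVATE_HIT
        self.private[core].add(addr)
        if self.byca and addr in self.flushed:
            # Bypass L3: no lookup and no fill, so L3 holds no trace of
            # this access. The block is served from the L4 DRAM cache.
            return L4_HIT
        if addr in self.l3:
            return L3_HIT
        self.l3.add(addr)
        if addr in self.l4:
            return L4_HIT
        self.l4.add(addr)
        return DRAM


def reload_latency(byca, victim_accesses):
    """One Flush+Reload round; returns the attacker's reload latency."""
    h = Hierarchy(byca)
    target = 0x1000
    h.access("victim", target)      # warm the hierarchy
    h.clflush(target)               # attacker flushes the target block
    if victim_accesses:
        h.access("victim", target)  # victim touches the block (or not)
    return h.access("attacker", target)


if __name__ == "__main__":
    for byca in (False, True):
        print("ByCA" if byca else "baseline",
              reload_latency(byca, True), reload_latency(byca, False))
```

In this model the baseline attacker sees an L3 hit when the victim has touched the block and a main-memory access otherwise, while the ByCA variant returns the L4 latency in both cases, which is the property the abstract claims.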
