Paper Information - Defending Against Flush+Reload Attack With DRAM Cache by Bypassing Shared SRAM Cache


Cache side-channel attacks are a critical security threat to modern computing systems. As a representative cache side-channel attack, the Flush+Reload attack allows an attacker to steal confidential information (e.g., a private encryption key) by monitoring the victim’s cache access patterns while the victim generates the confidential values. Meanwhile, to provide high performance for memory-intensive applications that do not fit in the on-chip SRAM-based last-level cache (e.g., the L3 cache), modern computing systems have started to deploy a DRAM cache between the SRAM-based last-level cache and the DRAM main memory, which can provide low latency and/or high bandwidth. In this work, we propose ByCA, an approach that exploits the DRAM cache for security rather than performance. ByCA bypasses the shared L3 cache when accessing cache blocks suspected of being an attacker’s target blocks. Consequently, ByCA eliminates the timing difference when the attacker accesses the target cache blocks, nullifying Flush+Reload attacks. To this end, ByCA keeps the cache blocks suspected of being the attacker’s target blocks in the L4 DRAM cache, together with their states (i.e., flushed by clflush or not), even when a clflush instruction is executed; ByCA redefines and re-implements the clflush instruction so that it does not flush cache blocks from the L4 DRAM cache while still flushing them from the other cache levels (i.e., the L1, L2, and L3 caches). In addition, ByCA bypasses the L3 cache when the attacker or the victim accesses target blocks flushed by clflush, so that the attacker always obtains the blocks from the L4 DRAM cache regardless of the victim’s access patterns. This eliminates the timing difference, so the attacker cannot monitor the victim’s cache access patterns. For the L4 DRAM cache, we implement the Alloy Cache design and use an unused bit in the tag entry of each block to store its state. ByCA requires only a single-bit extension to cache blocks in the private L1 and L2 caches and one bit in the tag entry of each block in the L4 DRAM cache. Our experimental results show that ByCA completely eliminates the timing differences when the attacker reloads the target blocks. Furthermore, ByCA causes no performance degradation for the victim while it co-runs with an attacker that flushes and reloads target blocks temporally and repetitively.
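
The mechanism described in the abstract can be illustrated with a small toy model. The sketch below is not the authors' simulator or hardware design. It is a minimal Python model that assumes a single shared L3, an L4 DRAM cache with one "flushed" bit per block, and made-up latencies (the names Hierarchy, reload_latency, L3_HIT, L4_HIT and MEM are all hypothetical). Private L1/L2 caches, replacement, and the conditions under which the flushed bit is cleared are not modeled. It only compares the attacker's reload latency with and without a victim access, for a baseline hierarchy and for a ByCA-like policy.

```python
from dataclasses import dataclass, field

# Illustrative latencies in cycles. These are assumed values, not from the paper.
L3_HIT, L4_HIT, MEM = 40, 120, 300


@dataclass
class Hierarchy:
    """Toy shared-cache model: L3 (SRAM) in front of L4 (DRAM cache)."""
    byca: bool                                   # True: ByCA-like policy, False: baseline
    l3: set = field(default_factory=set)         # blocks resident in the shared L3
    l4: dict = field(default_factory=dict)       # block -> flushed bit kept in L4

    def access(self, block):
        """Return the latency of a load issued by any core."""
        if self.byca and self.l4.get(block, False):
            # Block was flushed by clflush: serve it from L4 and bypass L3
            # (no L3 fill), whoever is accessing it.
            return L4_HIT
        if block in self.l3:
            return L3_HIT
        latency = L4_HIT if block in self.l4 else MEM
        self.l3.add(block)                       # normal fill path
        self.l4.setdefault(block, False)
        return latency

    def clflush(self, block):
        """Flush a block. With ByCA the L4 copy is kept and marked flushed."""
        self.l3.discard(block)
        if self.byca:
            if block in self.l4:                 # blocks absent from L4 are not modeled
                self.l4[block] = True
        else:
            self.l4.pop(block, None)


def reload_latency(byca, victim_accesses):
    """One Flush+Reload round; returns the attacker's reload latency."""
    h = Hierarchy(byca=byca)
    h.access("target")                           # warm-up: block is in L3 and L4
    h.clflush("target")                          # attacker: flush
    if victim_accesses:
        h.access("target")                       # victim touches the target block
    return h.access("target")                    # attacker: reload and time it


for byca in (False, True):
    hit = reload_latency(byca, victim_accesses=True)
    miss = reload_latency(byca, victim_accesses=False)
    print(f"byca={byca}: victim accessed -> {hit}, victim idle -> {miss}")
```

In this model the baseline shows two distinct reload latencies (an L3 hit when the victim accessed the block, a memory access otherwise), which is the signal Flush+Reload relies on. Under the ByCA-like policy both cases return the assumed L4 latency, so the reload time no longer depends on the victim's behavior.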
