Clearing the Shadows: Recovering Lost Performance for Invisible Speculative Execution through HW/SW Co-Design

Out-of-order processors heavily rely on speculation to achieve high performance, allowing instructions to bypass other slower instructions in order to fully utilize the processor's resources. Speculatively executed instructions do not affect the correctness of the application, as they never change the architectural state, but they do affect the micro-architectural behavior of the system. Until recently, these changes were considered to be safe but with the discovery of new security attacks that misuse speculative execution to leak secrete information through observable micro-architectural changes (so called side-channels), this is no longer the case. To solve this issue, a wave of software and hardware mitigations have been proposed, the majority of which delay and/or hide speculative execution until it is deemed to be safe, trading performance for security. These newly enforced restrictions change how speculation is applied and where the performance bottlenecks appear, forcing us to rethink how we design and optimize both the hardware and the software. We observe that many of the state-of-the-art hardware solutions targeting memory systems operate on a common scheme: the visible execution of loads or their dependents is blocked until they become safe to execute. In this work we propose a generally applicable hardware-software extension that focuses on removing the causes of loads' unsafety, generally caused by control and memory dependence speculation. As a result, we manage to make more loads safe to execute at an early stage, which enables us to schedule more loads at a time to overlap their delays and improve performance. We apply our techniques on the state-of-the-art Delay-on-Miss hardware defense and show that we reduce the performance gap to the unsafe baseline by 53% (on average).

[1]  Toon Verwaest,et al.  Spectre is here to stay: An analysis of side-channels and speculative execution , 2019, ArXiv.

[2]  Martin Schwarzl,et al.  NetSpectre: Read Arbitrary Memory over Network , 2018, ESORICS.

[3]  Thomas F. Wenisch,et al.  Foreshadow: Extracting the Keys to the Intel SGX Kingdom with Transient Out-of-Order Execution , 2018, USENIX Security Symposium.

[4]  Lizy Kurian John,et al.  Analysis of redundancy and application balance in the SPEC CPU2006 benchmark suite , 2007, ISCA '07.

[5]  Josep Torrellas,et al.  OmniOrder: Directory-based conflict serialization of transactions , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).

[6]  Dan Meng,et al.  Conditional Speculation: An Effective Approach to Safeguard Out-of-Order Execution Against Spectre Attacks , 2019, 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA).

[7]  Stefan Mangard,et al.  KASLR is Dead: Long Live KASLR , 2017, ESSoS.

[8]  Heechul Yun,et al.  SpectreGuard: An Efficient Data-centric Defense Mechanism against Spectre Attacks , 2019, 2019 56th ACM/IEEE Design Automation Conference (DAC).

[9]  Vikram S. Adve,et al.  LLVM: a compilation framework for lifelong program analysis & transformation , 2004, International Symposium on Code Generation and Optimization, 2004. CGO 2004..

[10]  Babak Falsafi,et al.  SMoTherSpectre: Exploiting Speculative Execution through Port Contention , 2019, CCS.

[11]  Stefanos Kaxiras,et al.  Efficient Invisible Speculative Execution through Selective Delay and Value Prediction , 2019, 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA).

[12]  Gernot Heiser,et al.  A survey of microarchitectural timing attacks and countermeasures on contemporary hardware , 2016, Journal of Cryptographic Engineering.

[13]  Josep Torrellas,et al.  Speculative Data-Oblivious Execution: Mobilizing Safe Prediction For Safe and Efficient Speculative Execution , 2020, 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA).

[14]  Josep Torrellas,et al.  MicroScope: Enabling Microarchitectural Replay Attacks , 2019, 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA).

[15]  Josep Torrellas,et al.  InvisiSpec: Making Speculative Execution Invisible in the Cache Hierarchy , 2018, 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[16]  Michael Hamburg,et al.  Spectre Attacks: Exploiting Speculative Execution , 2018, 2019 IEEE Symposium on Security and Privacy (SP).

[17]  Stefanos Kaxiras,et al.  Ghost loads: what is the cost of invisible speculation? , 2019, CF.

[18]  E CarlsonTrevor,et al.  Non-Speculative Load-Load Reordering in TSO , 2017 .

[19]  Ofir Weisse,et al.  NDA: Preventing Speculative Execution Attacks at Their Source , 2019, MICRO.

[20]  Somayeh Sardashti,et al.  The gem5 simulator , 2011, CARN.

[21]  Sam Ainsworth,et al.  MuonTrap: Preventing Cross-Domain Spectre-Like Attacks by Capturing Speculative State , 2020, 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA).

[22]  Daniel Gruss,et al.  ZombieLoad: Cross-Privilege-Boundary Data Sampling , 2019, CCS.

[23]  Gorka Irazoqui Apecechea,et al.  Cross Processor Cache Attacks , 2016, IACR Cryptol. ePrint Arch..

[24]  Eric Rotenberg,et al.  Control-Flow Decoupling: An Approach for Timely, Non-Speculative Branching , 2015, IEEE Transactions on Computers.

[25]  Gururaj Saileshwar,et al.  CleanupSpec: An "Undo" Approach to Safe Speculation , 2019, MICRO.

[26]  Prabhat Mishra,et al.  A Survey of Side-Channel Attacks on Caches and Countermeasures , 2017, Journal of Hardware and Systems Security.

[27]  Josep Torrellas,et al.  Speculative Taint Tracking (STT): A Comprehensive Protection for Speculatively Accessed Data , 2019, IEEE Micro.

[28]  Yinqian Zhang,et al.  SgxPectre: Stealing Intel Secrets From SGX Enclaves via Speculative Execution , 2020, IEEE Security & Privacy.

[29]  Frank Piessens,et al.  A Systematic Evaluation of Transient Execution Attacks and Defenses , 2018, USENIX Security Symposium.

[30]  A. Jaleel Memory Characterization of Workloads Using Instrumentation-Driven Simulation A Pin-based Memory Characterization of the SPEC CPU 2000 and SPEC CPU 2006 Benchmark Suites , 2022 .

[31]  John L. Henning SPEC CPU2006 benchmark descriptions , 2006, CARN.

[32]  Thomas F. Wenisch,et al.  Foreshadow-NG: Breaking the virtual memory abstraction with transient out-of-order execution , 2018 .

[33]  Nael B. Abu-Ghazaleh,et al.  SafeSpec: Banishing the Spectre of a Meltdown with Leakage-Free Speculation , 2018, 2019 56th ACM/IEEE Design Automation Conference (DAC).

[34]  Nael B. Abu-Ghazaleh,et al.  Spectre Returns! Speculation Attacks Using the Return Stack Buffer , 2018, IEEE Design & Test.

[35]  Stefanos Kaxiras,et al.  Non-speculative load-load reordering in TSO , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).

[36]  K JohnLizy,et al.  Analysis of redundancy and application balance in the SPEC CPU2006 benchmark suite , 2007 .