Paper information - Multi-module fully abstract compilation (extended abstract)

Marco Patrignani, Dominique Devriese, Frank Piessens
iMinds-DistriNet, KU Leuven, Belgium

High-level languages like Java or ML support abstraction and data encapsulation through language features such as modules, objects, classes, and/or abstract data types. But traditional compilation does not preserve such abstraction boundaries. At the machine code level, there is just a single address space in which all code is readable and all data is readable and writable. In other words, the entire high-level program is compiled down into a single protection domain. To a large extent, this is because the protection domain granularity of modern execution platforms is very coarse: the smallest unit of protection is an operating system process, and most programs are compiled to a single process.
For fully safe languages, and for attackers who can only provide input to and read output from programs, there is no need to preserve abstraction or protection boundaries after compilation: against such attackers, language safety is sufficient to guarantee that program abstractions are maintained. However, most compiled languages (including, for instance, C#, Java, Go, and Rust) are not fully safe: programs can contain unsafe blocks that might be subject to memory safety errors [1], or programs can interface with code written in unsafe languages through a native interface. In addition, attackers may have more power than just the ability to provide input and read output: for instance, programs might support binary plugins, making it possible for attackers to load arbitrary machine code into a process, or kernel-level malware can inspect any user process at the machine code abstraction level. In these circumstances, mechanisms that protect source-level data encapsulation and abstractions even after compilation to machine code are a valuable additional layer of defense.
Fortunately, in recent years there has been renewed interest in execution platforms that support fine-grained protection domains. Notable examples include protected module architectures [2], [3], [4], capability-enhanced processors [5], and general metadata-tracking processors [6]. The upcoming Intel Software Guard Extensions (SGX) [7] will bring such fine-grained protection to mainstream processors. With the availability of fine-grained protection domains at the machine code level, researchers have started exploring compilation techniques that build on such low-level protection to maintain source-level abstractions at the machine code level, even in the presence of attackers that can perform machine code injection attacks.
The correctness criterion for such compilation is the notion of full abstraction [8]. Roughly speaking, compilation is fully abstract if any attack that an arbitrary machine code context can mount against a compiled module could also be mounted by a source code context of that module, based on the source code semantics. More precisely, the compiler must preserve and reflect contextual equivalence between its source and target languages. Two modules $M_1$ and $M_2$ are contextually equivalent, written $M_1 \sim M_2$, if there is no context $C$ that can distinguish them, in the sense that $C[M_1]$ diverges and $C[M_2]$ does not, or vice versa. Compilation is fully abstract if $M_1 \sim M_2 \iff \llbracket M_1 \rrbracket \sim \llbracket M_2 \rrbracket$, where $\llbracket M \rrbracket$ denotes the compilation of $M$.
In previous work [9], [10], we proposed fully abstract compilation techniques for Java-like languages targeting an execution platform with a protected module architecture that supports a single protected module living in an unprotected machine code context. One specific source code module $M$ is compiled to a single protected module $\llbracket M \rrbracket$ at the machine code level, with the guarantee that machine code contexts can only perform attacks against $\llbracket M \rrbracket$ that could also be carried out by a source code context against $M$.
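The definition of full abstraction given above can be written out as two separate implications, which is how the phrase "preserve and reflect" is usually read. This is only a restatement under assumptions: $\Uparrow$ is assumed notation for divergence, the quantifier ranges over all contexts of the language the modules belong to, and $\llbracket M \rrbracket$ stands for the compilation of $M$.

```latex
% Contextual equivalence: no context C separates the two modules
% by their termination behaviour (\Uparrow is assumed notation for divergence).
M_1 \sim M_2 \;\triangleq\; \forall C.\; \bigl( C[M_1]\Uparrow \iff C[M_2]\Uparrow \bigr)

% Preservation: modules equivalent at source level stay equivalent after compilation.
M_1 \sim M_2 \implies \llbracket M_1 \rrbracket \sim \llbracket M_2 \rrbracket

% Reflection: modules equivalent after compilation were already equivalent at source level.
\llbracket M_1 \rrbracket \sim \llbracket M_2 \rrbracket \implies M_1 \sim M_2
```

Preservation is the security-relevant direction: a machine code context that could tell $\llbracket M_1 \rrbracket$ and $\llbracket M_2 \rrbracket$ apart when no source context can tell $M_1$ and $M_2$ apart would be an attack with no source-level counterpart.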
As a consequence of this guarantee, the single source code module $M$ is protected against code injection attacks originating from outside that module. This paper reports on our ongoing work to generalize this result to compilation schemes that target multiple protected modules.
Such a generalization is useful. The limitation to one protected module at run time makes sense for a safe source-level programming language, where only one module (the run-time library) contains native code that could have memory safety vulnerabilities. In this case, all the safe source code modules are compiled together into the single protected module of the target platform. But as soon as an application has more than one module that could contain safety vulnerabilities (or, alternatively, whose compiled version could have been tampered with), the limitation to one protected module on the target platform is unsatisfactory. To protect module $M_1$ against exploitation of safety vulnerabilities within module $M_2$ at run time, $\llbracket M_1 \rrbracket$ must be in a different protected module than $\llbracket M_2 \rrbracket$. Our generalization makes it possible to compile each source code module to a corresponding protected module on the target platform, thus limiting the impact of code injection attacks against each individual module.
It is also a non-trivial generalization, as malicious machine code contexts can now try to intervene in the interactions between two protected modules. A concrete example of where the existing compilation scheme [10] fails is the way in which object sharing is handled. Patrignani et al.'s compilation scheme protects against the context guessing private object identities by maintaining a table of all objects that have been shared with the context, and using an index into that table as the identity of an object outside the protected module.
This protection is insufficient in the multi-module case, as it does not protect against a scenario where module $M_1$ shares an object $O$ with $M_2$ but not with $M_3$. $\llbracket M_3 \rrbracket$ can now guess the object identity of $O$ and call methods on it, thus breaking full abstraction.
In this work in progress, we investigate the additional complications that arise for multi-module fully abstract compilation, show how some of the proposed execution platforms with fine-grained protection lack features needed to support such compilation, and develop two approaches: the first achieves a form of probabilistic full abstraction on SGX-like systems, and the second achieves full abstraction on a specific variant of capability-enhanced hardware.
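The object-sharing problem described above can be illustrated with a small simulation. This is a minimal sketch, not the compilation scheme of [10]: `ObjectTable`, `Account`, and `simulate_guessing_attack` are hypothetical names, and the table is assumed to hand out consecutive integer indices and to check only that an index is in range.

```python
class ObjectTable:
    """Hypothetical table that hides real object identities behind indices."""

    def __init__(self):
        self._shared = []  # objects that have been passed outside the module

    def mask(self, obj):
        """Return the external identity (a table index) of obj, adding it if new."""
        for index, known in enumerate(self._shared):
            if known is obj:
                return index
        self._shared.append(obj)
        return len(self._shared) - 1

    def unmask(self, index):
        """Resolve an external identity; reject indices that were never handed out."""
        if not isinstance(index, int) or not 0 <= index < len(self._shared):
            raise LookupError("unknown object identity")
        return self._shared[index]


class Account:
    """Stand-in for an object O that lives inside the protected module for M1."""

    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        self.balance -= amount
        return self.balance


def simulate_guessing_attack():
    table = ObjectTable()                # owned by the compiled M1 (assumption)
    account = Account(100)               # the object O
    handle_for_m2 = table.mask(account)  # M1 shares O with M2 only

    # M3 never received a handle, but indices are small consecutive integers.
    guessed_handle = 0
    victim = table.unmask(guessed_handle)  # succeeds: the table ignores who asks
    victim.withdraw(100)
    return handle_for_m2, guessed_handle, account.balance


print(simulate_guessing_attack())  # prints (0, 0, 0)
```

The table in this sketch records which objects have left the module but not which module received them, so any caller presenting a valid index is treated the same. With a single protected module and one untrusted context this is harmless, because everything in the table was shared with that context anyway. With several modules, a handle has to be either hard to guess or bound to its intended recipient; the sketch does not attempt to show how the two approaches mentioned above achieve this.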

Dominique Devriese | Marco Patrignani | Frank Piessens

[1] Peter G. Neumann, et al. The CHERI capability model: Revisiting RISC in an age of risk, 2014, ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).

[2] Michael K. Reiter, et al. Flicker: an execution infrastructure for TCB minimization, 2008, EuroSys '08.

[3] Marco Patrignani, et al. Secure Compilation to Protected Module Architectures, 2015, TOPLAS.

[4] Frank Piessens, et al. Sancus: Low-cost Trustworthy Extensible Networked Devices with a Zero-software Trusted Computing Base, 2013, USENIX Security Symposium.

[5] Martín Abadi, et al. Protection in Programming-Language Translations, 1998, ICALP.

[6] Carlos V. Rozas, et al. Innovative instructions and software model for isolated execution, 2013, HASP '13.

[7] Jonathan M. Smith, et al. Architectural Support for Software-Defined Metadata Processing, 2015, ASPLOS.

[8] Frank Piessens, et al. Fides: selectively hardening software application components against kernel-level or process-level malware, 2012, CCS '12.

[9] Wouter Joosen, et al. Runtime countermeasures for code injection attacks against C and C++ programs, 2012, CSUR.

[10] Frank Piessens, et al. Secure Compilation to Modern Processors, 2012, IEEE 25th Computer Security Foundations Symposium.