Interpretational overhead in system software

Interpreting a program carries a runtime penalty: the interpretational overhead. Traditionally, a compiler removes interpretational overhead by sacrificing inessential details of program execution. However, a broad class of system software is based on non-standard interpretation of machine code or a higher-level language. For example, virtual machine monitors emulate privileged instructions; program instrumentation is used to build dynamic call graphs by intercepting function calls and returns; and dynamic software updating technology allows program code to be altered at runtime. Many of these frameworks are performance-sensitive, and several efficiency requirements, both formal and informal, have been put forward over the last four decades. Largely independently, the concept of interpretational overhead has received much attention in the partial evaluation (“program specialization”) literature.

This dissertation contributes a unifying understanding of efficiency and interpretational overhead in system software. Starting from the observation that a virtual machine monitor is a self-interpreter for machine code, our first contribution is to reconcile the definition of efficient virtualization due to Popek and Goldberg with Jones optimality, a measure of the strength of program specializers. We also present a rational reconstruction of hardware virtualization support (“trap-and-emulate”) from context-threaded interpretation, a technique for implementing fast interpreters due to Berndl et al.

As a form of augmented execution, virtualization shares many similarities with program instrumentation. Although several low-overhead instrumentation frameworks are available on today’s hardware, there has been no formal understanding of what it means for instrumentation to be efficient. Our second contribution is a definition of efficiency for program instrumentation in the spirit of Popek and Goldberg’s work. Instrumentation also incurs an implicit overhead: instrumentation code needs access to intermediate execution states, and this is antagonistic to optimization. The third contribution is to use partial equivalence relations (PERs) to express the dependence of instrumentation on execution state, enabling an instrumentation/optimization trade-off.

Since program instrumentation, applied at runtime, constitutes a kind of dynamic software update, we can similarly restrict allowable future updates to be consistent with existing optimizations. Finally, treating “old” and “new” code in a dynamically updatable program as being written in different languages permits a semantic explanation of a safety rule that was originally introduced as a syntactic check.
