The S2E Platform: Design, Implementation, and Applications

This article presents S2E, a platform for analyzing the properties and behavior of software systems, and describes its use in building tools for comprehensive performance profiling, reverse engineering of proprietary software, and automated testing of kernel-mode and user-mode binaries. Conceptually, S2E is an automated path explorer with modular path analyzers: the explorer uses a symbolic execution engine to drive the target system down all execution paths of interest, while the analyzers measure and/or check properties of each such path. Users can either combine existing analyzers into custom analysis tools or use S2E's APIs directly. S2E's strength is its ability to scale to large systems, such as a full Windows stack, by means of two new ideas: selective symbolic execution, which automatically minimizes the amount of code that must be executed symbolically for a given target analysis, and execution consistency models, which allow principled performance/accuracy trade-offs during analysis. These techniques give S2E three key abilities: to analyze entire families of execution paths simultaneously, rather than one execution at a time; to perform analyses in vivo within a real software stack (user programs, libraries, kernel, drivers, etc.) instead of relying on abstract models of these layers; and to operate directly on binaries, which makes it possible to analyze even proprietary software.
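The "path explorer plus modular analyzers" structure described above can be illustrated with a toy sketch. This is not S2E's API or implementation: every name here (`PathContext`, `explore`, `branch`, `event`) is hypothetical, the "symbolic" input is a single byte, and path feasibility is decided by brute-force search over that byte rather than by a constraint solver. The sketch only shows how an explorer can fork at feasible branches while independent analyzers inspect each completed path.

```python
from typing import Callable, Dict, List, Optional

DOMAIN = range(256)  # assumption: the symbolic input is one byte
Pred = Callable[[int], bool]


class PathContext:
    """State of one explored path (hypothetical helper, not an S2E class)."""

    def __init__(self, prefix: List[bool]):
        self.prefix = list(prefix)         # branch outcomes to replay
        self.taken: List[bool] = []        # outcomes taken so far
        self.constraints: List[Pred] = []  # path condition over the input
        self.forks: List[List[bool]] = []  # feasible alternatives found
        self.trace: List[str] = []         # events visible to analyzers

    def _sat(self, extra: Pred) -> bool:
        # Brute-force stand-in for a constraint solver.
        preds = self.constraints + [extra]
        return any(all(p(x) for p in preds) for x in DOMAIN)

    def witness(self) -> Optional[int]:
        # One concrete input that follows this path, if any.
        for x in DOMAIN:
            if all(p(x) for p in self.constraints):
                return x
        return None

    def branch(self, pred: Pred) -> bool:
        neg: Pred = lambda x, p=pred: not p(x)
        i = len(self.taken)
        if i < len(self.prefix):
            outcome = self.prefix[i]  # replaying a scheduled decision
        else:
            can_true, can_false = self._sat(pred), self._sat(neg)
            outcome = can_true        # prefer the true side when feasible
            if can_true and can_false:
                self.forks.append(self.taken + [False])
        self.constraints.append(pred if outcome else neg)
        self.taken.append(outcome)
        return outcome

    def event(self, name: str) -> None:
        self.trace.append(name)


def explore(target, analyzers) -> List[Dict]:
    """Run `target` once per feasible path; apply every analyzer to each."""
    worklist: List[List[bool]] = [[]]
    reports = []
    while worklist:
        ctx = PathContext(worklist.pop())
        target(ctx)
        worklist.extend(ctx.forks)
        report = {"branches": tuple(ctx.taken), "example_input": ctx.witness()}
        for analyze in analyzers:
            report.update(analyze(ctx))
        reports.append(report)
    return reports


def target(ctx: PathContext) -> None:
    # Toy program under analysis; its branches depend on the symbolic byte.
    ctx.event("parse")
    if ctx.branch(lambda x: x < 16):
        ctx.event("fast_path")
        if ctx.branch(lambda x: x > 32):  # infeasible once x < 16 holds
            ctx.event("unreachable")
    else:
        ctx.event("slow_path")
        if ctx.branch(lambda x: x % 2 == 0):
            ctx.event("even")


# Two independent analyzers: one measures a property, one checks a property.
count_events = lambda ctx: {"events": len(ctx.trace)}
check_unreachable = lambda ctx: {"bug": "unreachable" in ctx.trace}

if __name__ == "__main__":
    for report in explore(target, [count_events, check_unreachable]):
        print(report)
```

Because analyzers only receive the per-path context, a profiler-style analyzer and a bug checker can be combined freely without changing the explorer. A real system would replace the brute-force feasibility check with a solver and would fork execution state instead of re-running the target from the start.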
