Symbolic execution of multithreaded programs from arbitrary program contexts

We describe an algorithm for symbolic execution of a multithreaded program starting from an arbitrary program context. We argue that this enables more efficient symbolic exploration of deep code paths in multithreaded programs, because the symbolic engine can jump directly to program contexts of interest. The key challenge is modeling the initial context with reasonable precision: an overly approximate model leads symbolic execution to explore many infeasible paths, while a very precise model would be so expensive to compute that it would defeat the purpose of jumping directly to the initial context. We propose a context-specific dataflow analysis that approximates the initial context cheaply, yet precisely enough to avoid some common causes of infeasible-path explosion. This model is necessarily approximate: it may leave portions of the memory state unconstrained, so that symbolic execution cannot answer even simple questions such as "which thread holds lock A?". For such cases, we describe a novel algorithm for evaluating symbolic synchronization during symbolic execution. Our symbolic execution semantics are sound and complete up to the limits of the underlying SMT solver. We describe initial experiments on an implementation in Cloud9.
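To make the "which thread holds lock A?" problem concrete, the following toy sketch shows one way an engine could fork on an unconstrained lock owner. It is not the algorithm from the paper. It assumes non-reentrant locks and replaces SMT constraints with a finite set of candidate owners per lock. The names `State`, `acquire`, and `FREE` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List

FREE = "<free>"  # hypothetical marker meaning "no thread holds the lock"


@dataclass
class State:
    # For each lock, the owners still consistent with the path so far.
    # A lock missing from this map is fully unconstrained, as it would be
    # when execution starts from an arbitrary program context.
    owners: Dict[str, FrozenSet[str]]
    # Threads that are blocked waiting on some lock in this state.
    blocked: FrozenSet[str]


def acquire(state: State, thread: str, lock: str,
            all_threads: FrozenSet[str]) -> List[State]:
    """Fork on the unknown owner of `lock` when `thread` tries to take it."""
    candidates = state.owners.get(lock, frozenset(all_threads) | {FREE})
    successors: List[State] = []

    # Branch 1: the lock was free, so the acquire succeeds and the
    # owner becomes known exactly.
    if FREE in candidates:
        owners = dict(state.owners)
        owners[lock] = frozenset({thread})
        successors.append(State(owners, state.blocked))

    # Branch 2: some other thread holds the lock, so `thread` blocks.
    # The owner stays symbolic but is narrowed to the other threads.
    held_by_other = candidates - {FREE, thread}
    if held_by_other:
        owners = dict(state.owners)
        owners[lock] = held_by_other
        successors.append(State(owners, state.blocked | {thread}))

    return successors


if __name__ == "__main__":
    threads = frozenset({"T1", "T2"})
    initial = State(owners={}, blocked=frozenset())
    for successor in acquire(initial, "T1", "A", threads):
        print(successor)
```

With two threads and an unconstrained lock `A`, the call forks into one state where `T1` acquires the lock and one where `T1` blocks because `T2` is assumed to hold it. A real engine would encode the owner as a solver variable and prune branches the initial-context model rules out, which is where a more precise model reduces infeasible paths.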
