Program synthesis by sketching

The goal of software synthesis is to generate programs automatically from high-level specifications. However, efficient implementations of challenging programs require a combination of high-level algorithmic insights and low-level implementation details. Deriving the low-level details is a natural job for a computer, but the synthesizer cannot replace human insight. Therefore, one of the central challenges for software synthesis is to establish a synergy between the programmer and the synthesizer, exploiting the programmer's expertise to reduce the burden on the synthesizer. This thesis introduces sketching, a new style of synthesis that offers a fresh approach to the synergy problem. Previous approaches have relied on meta-programming or on variations of interactive theorem proving to help the synthesizer deduce an efficient implementation. The resulting systems are very powerful, but they require the programmer to master new formalisms far removed from traditional programming models. To make synthesis accessible, programmers must be able to provide their insight effortlessly, using formalisms they already understand. In sketching, insight is communicated through a partial program: a sketch that expresses the high-level structure of an implementation but leaves holes in place of the low-level details. This form of synthesis is made possible by a new SAT-based inductive synthesis procedure that can efficiently synthesize an implementation from a small number of test cases. This algorithm forms the core of a new counterexample-guided inductive synthesis (CEGIS) procedure, which combines the inductive synthesizer with a validation procedure to automatically generate test inputs and ensure that the generated program satisfies its specification. With a few extensions, CEGIS can even use its sequential inductive synthesizer to generate concurrent programs; all the concurrency-related reasoning is delegated to an off-the-shelf validation procedure.
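The CEGIS loop described above can be illustrated with a minimal sketch. This is a toy under stated assumptions, not the thesis's implementation: the inductive synthesizer here is a brute-force search over hole values rather than a SAT-based procedure, the validation step is an exhaustive check over 8-bit inputs, and the names `spec`, `sketch`, `inductive_synthesize`, `verify`, and `cegis` are all hypothetical. The example problem (isolating the lowest set bit of a word, with one constant left as a hole) is chosen only for illustration.

```python
from typing import List, Optional, Tuple

WIDTH = 8                      # assumed word size for this toy example
MASK = (1 << WIDTH) - 1


def spec(x: int) -> int:
    """Reference specification: isolate the lowest set bit, one bit at a time."""
    for i in range(WIDTH):
        if x & (1 << i):
            return 1 << i
    return 0


def sketch(x: int, hole: int) -> int:
    """Partial program: the structure is fixed, the constant `hole` is unknown."""
    return x & ((~x + hole) & MASK)


def inductive_synthesize(examples: List[int]) -> Optional[int]:
    """Find a hole value consistent with the spec on the given test inputs.

    Brute-force enumeration stands in for the SAT-based inductive synthesizer.
    """
    for hole in range(1 << WIDTH):
        if all(sketch(x, hole) == spec(x) for x in examples):
            return hole
    return None


def verify(hole: int) -> Optional[int]:
    """Validation step: return a counterexample input, or None if none exists."""
    for x in range(1 << WIDTH):
        if sketch(x, hole) != spec(x):
            return x
    return None


def cegis() -> Tuple[Optional[int], List[int]]:
    """Alternate synthesis and validation until a candidate passes or none is left."""
    examples = [0]             # arbitrary initial test input
    while True:
        hole = inductive_synthesize(examples)
        if hole is None:
            return None, examples      # the sketch cannot satisfy the spec
        counterexample = verify(hole)
        if counterexample is None:
            return hole, examples      # candidate is correct on all inputs
        examples.append(counterexample)


if __name__ == "__main__":
    hole, examples = cegis()
    print("hole =", hole, "test inputs used:", examples)
```

Each failed validation adds one input to the test set, so the synthesizer only ever reasons about a handful of concrete inputs while the validator carries the burden of covering the whole input space.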
The resulting synthesis system scales to real programming problems from a variety of domains, ranging from bit-level ciphers to manipulations of linked data structures. The system was even used to produce a complete, optimized implementation of the AES cipher. The concurrency-aware synthesizer was also used to synthesize, in a matter of minutes, the details of a fine-grained locking scheme for a concurrent set, a sense-reversing barrier, and even a solution to the dining philosophers problem. The system was also extended with domain-specific knowledge to better handle the problem of implementing stencil computations, an important domain in scientific computing. For this domain, we were able to encode domain-specific insight as a problem reduction that converts stencil sketches into simplified sketch problems, which CEGIS resolved in a matter of minutes. This specialized synthesizer was used to quickly implement a MultiGrid solver for partial differential equations that incorporates many difficult implementation strategies from the literature. In short, this thesis shows that sketching is a viable approach to making synthesis practical in a general programming context.
