Formulog: Datalog for SMT-based static analysis

Satisfiability modulo theories (SMT) solving has become a critical part of many static analyses, including symbolic execution, refinement type checking, and model checking. We propose Formulog, a domain-specific language that makes it possible to write a range of SMT-based static analyses in a way that is both close to their formal specifications and amenable to high-level optimizations and efficient evaluation. Formulog extends the logic programming language Datalog with a first-order functional language and mechanisms for representing and reasoning about SMT formulas; a novel type system supports the construction of expressive formulas, while ensuring that neither normal evaluation nor SMT solving goes wrong. Our case studies demonstrate that a range of SMT-based analyses can be encoded naturally and concisely in Formulog, and that, thanks to this encoding, high-level Datalog-style optimizations can be applied to these analyses automatically and to good effect.
