Automatic proofs and refutations for higher-order logic

This thesis describes work on two components of the interactive theorem prover Isabelle/HOL. The primary contribution is the development of Nitpick, a counterexample generator that builds on a first-order relational model finder. The second main contribution is the further development of the Sledgehammer proof tool. This tool heuristically selects facts relevant to the conjecture to prove and delegates the problem to first-order resolution provers and SMT solvers.
