Sledgehammer: Judgement Day

Sledgehammer, a component of the interactive theorem prover Isabelle, finds proofs in higher-order logic by calling the automated first-order provers E, SPASS and Vampire. This paper is the largest and most detailed empirical evaluation of such a link to date. Our test data consists of 1240 proof goals arising in 7 diverse Isabelle theories, thus representing typical Isabelle proof obligations. We measure the effectiveness of Sledgehammer along with other parameters such as run time and proof complexity. We also present and analyze a facility for minimizing the number of facts needed to prove a goal.
