MaSh: Machine Learning for Sledgehammer

Sledgehammer integrates automatic theorem provers into the proof assistant Isabelle/HOL. A key component, the relevance filter, heuristically ranks the thousands of available facts and selects a subset based on syntactic similarity to the current goal. We introduce MaSh, an alternative that learns from successful proofs. New challenges arose from our "zero-click" vision: MaSh should integrate seamlessly with users' workflow, so that they benefit from machine learning without having to install software, set up servers, or guide the learning. The underlying machinery draws on recent research in the context of Mizar and HOL Light, with a number of enhancements. MaSh outperforms the old relevance filter on large formalizations, and a particularly strong filter is obtained by combining the two filters.
