HOList: An Environment for Machine Learning of Higher-Order Theorem Proving (extended version)

We present an environment, a benchmark, and a deep-learning-driven automated theorem prover for higher-order logic. Higher-order interactive theorem provers enable the formalization of arbitrary mathematical theories and thereby present an interesting, open-ended challenge for deep learning. We provide an open-source framework based on the HOL Light theorem prover that can be used as a reinforcement learning environment. HOL Light comes with broad coverage of basic mathematical theorems on calculus and with the formal proof of the Kepler conjecture, from which we derive a challenging benchmark for automated reasoning. We also present DeepHOL, an automated theorem prover driven by deep reinforcement learning, with strong initial results on this benchmark.
