Learning to Prove from Synthetic Theorems

A major challenge in applying machine learning to automated theorem proving is the scarcity of training data, a key ingredient in training successful deep learning models. To tackle this problem, we propose training on synthetic theorems generated from a set of axioms. We show that such theorems can be used to train an automated prover and that the learned prover transfers successfully to human-generated theorems. We demonstrate that a prover trained exclusively on synthetic theorems can solve a substantial fraction of problems in TPTP, a benchmark dataset used to compare state-of-the-art heuristic provers. Our approach outperforms a model trained on human-generated problems on most axiom sets, showing the promise of synthetic data for this task.
