Transformers Generalize to the Semantics of Logics

We show that neural networks can learn the semantics of propositional and linear-time temporal logic (LTL) from imperfect training data. Instead of only predicting the truth value of a formula, we use a Transformer architecture to predict a solution to a given formula, e.g., a satisfying variable assignment for a formula in propositional logic. Most formulas have many solutions, so the training data depends on the particularities of the generator. We make the surprising observation that although the Transformer does not perfectly reproduce the generator's output, it still produces correct solutions to almost all formulas, even when its prediction deviates from the generator's. Apparently, it is easier to learn the semantics of the logics than the particularities of the generator. The Transformer preserves this semantic generalization even when challenged with formulas of sizes it has never encountered before. Remarkably, it solves almost all LTL formulas in our test set, including those for which our generator timed out.
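
To make the prediction target concrete, the following is a minimal sketch of what "solving" a propositional formula means: producing a variable assignment under which the formula evaluates to true. This is purely illustrative; the tuple encoding and the helper names (evaluate, variables, solve) are assumptions chosen for this example, not the paper's input format or method.

```python
from itertools import product

# Hypothetical formula encoding as nested tuples, e.g.
# ("and", ("var", "a"), ("or", ("not", ("var", "b")), ("var", "c")))
# represents a AND (NOT b OR c).

def evaluate(formula, assignment):
    """Evaluate a propositional formula under a truth assignment."""
    op = formula[0]
    if op == "var":
        return assignment[formula[1]]
    if op == "not":
        return not evaluate(formula[1], assignment)
    if op == "and":
        return evaluate(formula[1], assignment) and evaluate(formula[2], assignment)
    if op == "or":
        return evaluate(formula[1], assignment) or evaluate(formula[2], assignment)
    raise ValueError(f"unknown operator: {op}")

def variables(formula):
    """Collect the variable names occurring in a formula."""
    if formula[0] == "var":
        return {formula[1]}
    return set().union(*(variables(sub) for sub in formula[1:]))

def solve(formula):
    """Brute-force search for a satisfying assignment (a 'solution')."""
    names = sorted(variables(formula))
    for values in product([False, True], repeat=len(names)):
        assignment = dict(zip(names, values))
        if evaluate(formula, assignment):
            return assignment  # one of possibly many solutions
    return None  # unsatisfiable

# Example: a AND (NOT b OR c) has several solutions; a generator returns one of them.
phi = ("and", ("var", "a"), ("or", ("not", ("var", "b")), ("var", "c")))
print(solve(phi))  # e.g. {'a': True, 'b': False, 'c': False}
```

Because phi has several satisfying assignments, any generator built along these lines emits only one of them per formula. The paper's observation, restated in these terms, is that a Transformer trained on such data learns to output some correct assignment rather than memorizing which one the generator happened to choose.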
