Can Neural Networks Understand Logical Entailment?

We introduce a new dataset of logical entailments for measuring models' ability to capture and exploit the structure of logical expressions on an entailment prediction task. We use this task to compare a series of architectures ubiquitous in the sequence-processing literature, alongside a new model class, PossibleWorldNets, which computes entailment as a "convolution over possible worlds". Results show that convolutional networks present the wrong inductive bias for this class of problems relative to LSTM RNNs, that tree-structured neural networks outperform LSTM RNNs due to their enhanced ability to exploit the syntax of logic, and that PossibleWorldNets outperform all benchmarks.
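For propositional formulas, the ground-truth entailment relation such a dataset encodes can be checked by brute force over truth assignments (possible worlds): A entails B iff B holds in every world that satisfies A. A minimal sketch of this check, assuming formulas are represented as Python predicates over an assignment dict (an illustrative encoding, not the paper's dataset format):

```python
from itertools import product

def entails(a, b, variables):
    """Return True iff `a` entails `b`.

    `a` and `b` are functions mapping an assignment dict
    (variable name -> bool) to a bool. This representation is an
    illustrative choice, not the paper's format.
    """
    for values in product([False, True], repeat=len(variables)):
        world = dict(zip(variables, values))  # one "possible world"
        if a(world) and not b(world):
            return False  # counterexample: a world where A holds but B fails
    return True

# Example: (p and q) entails p, but p does not entail (p and q).
conj = lambda w: w["p"] and w["q"]
atom = lambda w: w["p"]
print(entails(conj, atom, ["p", "q"]))  # True
print(entails(atom, conj, ["p", "q"]))  # False
```

This exhaustive enumeration is exponential in the number of variables, which is precisely why the task is a meaningful structural challenge for learned models rather than a trivial lookup.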