Learning to Reason with Third-Order Tensor Products

We combine Recurrent Neural Networks with Tensor Product Representations to learn combinatorial representations of sequential data. This improves symbolic interpretation and systematic generalisation. Our architecture is trained end-to-end through gradient descent on a variety of simple natural language reasoning tasks, significantly outperforming the latest state-of-the-art models in single-task and all-tasks settings. We also augment a subset of the data such that training and test data exhibit large systematic differences and show that our approach generalises better than the previous state-of-the-art.
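The core idea behind a third-order Tensor Product Representation is to bind an entity, a relation, and a value together by an outer product and to read a value back by contracting the resulting third-order tensor with the entity and relation. The sketch below illustrates only that bind/unbind step in NumPy; the function names, dimensions, and unit-norm assumption are illustrative choices, not the paper's trained end-to-end architecture, in which an RNN produces these vectors from the input sequence.

```python
import numpy as np

# Minimal sketch of a third-order Tensor Product Representation (TPR).
# Dimensions and names are illustrative assumptions, not the authors' code.
d_e, d_r, d_v = 8, 4, 8  # entity, relation, and value dimensions (assumed)

def bind(memory, entity, relation, value):
    """Add the outer product entity (x) relation (x) value to the 3D memory tensor."""
    return memory + np.einsum('i,j,k->ijk', entity, relation, value)

def unbind(memory, entity, relation):
    """Retrieve the value bound to an (entity, relation) pair by tensor contraction."""
    return np.einsum('ijk,i,j->k', memory, entity, relation)

# Usage: store one fact and read it back.
memory = np.zeros((d_e, d_r, d_v))
entity = np.random.randn(d_e); entity /= np.linalg.norm(entity)
relation = np.random.randn(d_r); relation /= np.linalg.norm(relation)
value = np.random.randn(d_v)

memory = bind(memory, entity, relation, value)
retrieved = unbind(memory, entity, relation)  # close to `value` when entity and relation have unit norm
print(np.allclose(retrieved, value, atol=1e-6))
```

With unit-norm entity and relation vectors, the contraction recovers the stored value exactly; interference only arises once several facts sharing similar keys are superimposed in the same tensor.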
