Logic and the 2-Simplicial Transformer

We introduce the $2$-simplicial Transformer, an extension of the Transformer which includes a form of higher-dimensional attention generalising the dot-product attention, and uses this attention to update entity representations with tensor products of value vectors. We show that this architecture is a useful inductive bias for logical reasoning in the context of deep reinforcement learning.
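
As a rough illustration of the idea in the abstract, the sketch below lets each entity attend to pairs of other entities and updates it with a linear map of the tensor product of the two attended value vectors. It is a minimal toy in pure Python, not the authors' implementation. The abstract does not specify the logit function, so a plain trilinear form is assumed here as a stand-in, and the names `two_simplicial_attention`, `keys1`, `keys2`, and `mix` are hypothetical.

```python
import math


def two_simplicial_attention(queries, keys1, keys2, values, mix):
    """Toy 2-simplicial attention over n entities.

    queries, keys1, keys2: n lists of d floats each.
    values: n lists of m floats each.
    mix: m*m rows of d_out floats; a linear map applied to the flattened
         tensor product of two value vectors (hypothetical parameter).
    Returns n lists of d_out floats.
    """
    n = len(queries)
    d_out = len(mix[0])
    outputs = []
    for i in range(n):
        # Trilinear logit for each pair (j, k):
        #   sum_c queries[i][c] * keys1[j][c] * keys2[k][c]
        # ASSUMPTION: stands in for the paper's logit function.
        logits = [
            [sum(q * a * b for q, a, b in zip(queries[i], keys1[j], keys2[k]))
             for k in range(n)]
            for j in range(n)
        ]
        # Softmax taken jointly over all (j, k) pairs; subtracting the
        # maximum keeps the exponentials numerically stable.
        peak = max(max(row) for row in logits)
        weights = [[math.exp(x - peak) for x in row] for row in logits]
        total = sum(sum(row) for row in weights)

        out = [0.0] * d_out
        for j in range(n):
            for k in range(n):
                w = weights[j][k] / total
                # Flattened tensor product values[j] (x) values[k], length m*m.
                pair = [a * b for a in values[j] for b in values[k]]
                for r, p in enumerate(pair):
                    for c in range(d_out):
                        out[c] += w * p * mix[r][c]
        outputs.append(out)
    return outputs


# Two entities with zero queries, so every pair gets equal weight 1/4.
zero = [[0.0], [0.0]]
result = two_simplicial_attention(zero, zero, zero, [[1.0], [3.0]], [[1.0]])
```

With zero queries the attention is uniform over the four pairs, so each entity receives the average of the pairwise value products. Ordinary dot-product attention is the analogous construction with one key per logit and a single value vector in the update.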

James Clift | Daniel Murfet | Dmitry Doryn | James Wallbridge
