Attentive Tensor Product Learning

This paper proposes a novel neural architecture, Attentive Tensor Product Learning (ATPL), for representing the grammatical structure of natural language in deep learning models. ATPL exploits Tensor Product Representations (TPRs), a structured neural-symbolic model developed in cognitive science, to integrate deep learning with explicit linguistic structures and rules. The key ideas of ATPL are: 1) unsupervised learning of role-unbinding vectors for words via a TPR-based deep neural network; 2) the use of attention modules to compute the TPR; and 3) the integration of TPR with standard deep learning architectures, including long short-term memory (LSTM) and feedforward networks. The novelty of the approach lies in its ability to extract the grammatical structure of a sentence through role-unbinding vectors that are learned without supervision. ATPL is applied to 1) image captioning, 2) part-of-speech (POS) tagging, and 3) constituency parsing. Experimental results demonstrate the effectiveness of the proposed approach on all three natural language processing tasks.
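To make the TPR mechanism concrete, the following is a minimal NumPy sketch of tensor product binding and unbinding in the sense of Smolensky's TPRs: filler vectors (e.g., word embeddings) are bound to role vectors via outer products and summed into a TPR matrix, and a role-unbinding vector retrieves the corresponding filler. The final lines illustrate, loosely, how an attention distribution over roles can produce a soft role-unbinding vector, echoing ATPL's use of attention to compute the TPR. All names, dimensions, and the orthonormal-role assumption are illustrative choices for this sketch, not details taken from the paper.

```python
import numpy as np

np.random.seed(0)

d_fill, d_role, n = 8, 6, 4  # filler dim, role dim, number of symbols (illustrative)

# Fillers: stand-ins for word embeddings.
fillers = np.random.randn(n, d_fill)

# Roles: chosen orthonormal here so that unbinding is exact.
q, _ = np.linalg.qr(np.random.randn(d_role, n))  # columns of q are orthonormal
roles = q.T                                      # shape (n, d_role); rows are role vectors

# Binding: T = sum_i f_i (outer) r_i, a d_fill x d_role matrix.
T = sum(np.outer(fillers[i], roles[i]) for i in range(n))

# Unbinding: with orthonormal roles, T @ r_j recovers filler j exactly.
u = roles[2]                                 # role-unbinding vector for symbol 2
recovered = T @ u
print(np.allclose(recovered, fillers[2]))    # True

# Attention-style soft unbinding: a softmax over roles gives a convex
# combination of role vectors, yielding an attention-weighted filler mixture.
scores = np.random.randn(n)                  # would be produced by a learned module
attn = np.exp(scores) / np.exp(scores).sum() # softmax over roles
u_soft = attn @ roles                        # soft role-unbinding vector
soft_filler = T @ u_soft                     # weighted mixture of fillers
```

In ATPL the role-unbinding vectors and attention weights are learned end to end rather than fixed as above; the sketch only illustrates the algebra of binding and unbinding that the architecture builds on.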
