A Neural Knowledge Language Model

Current language models have significant limitations in their ability to encode and decode knowledge, mainly because they acquire knowledge from statistical co-occurrences, even though much factual knowledge is carried by rarely observed named entities. In this paper, we propose a Neural Knowledge Language Model (NKLM) that combines the symbolic knowledge provided by a knowledge graph with an RNN language model. At each time step, the model predicts the fact on which the observed word is based; the word is then either generated from the vocabulary or copied from the knowledge graph. We train and test the model on a new dataset, WikiFacts. Our experiments show that the NKLM significantly improves perplexity while generating far fewer unknown words. In addition, we demonstrate that the descriptions it samples contain named entities that would be unknown words under a standard RNN language model.
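To make the per-step decision concrete, here is a minimal sketch of one NKLM-style decoding step in PyTorch: score candidate facts against the RNN state, then gate between copying from the chosen fact and generating from the ordinary vocabulary. All module names, dimensions, and the greedy fact selection are hypothetical illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class NKLMStep(nn.Module):
    """One decoding step of an NKLM-style model (illustrative sketch only)."""

    def __init__(self, hidden_dim, fact_dim, vocab_size):
        super().__init__()
        self.fact_scorer = nn.Linear(hidden_dim, fact_dim)      # projects state to fact space
        self.copy_gate = nn.Linear(hidden_dim + fact_dim, 1)    # copy vs. generate decision
        self.vocab_out = nn.Linear(hidden_dim + fact_dim, vocab_size)

    def forward(self, h_t, fact_embs):
        # h_t: (batch, hidden_dim) RNN state
        # fact_embs: (batch, n_facts, fact_dim) embeddings of the entity's facts
        batch = h_t.size(0)

        # 1) Predict the fact on which the next word should be based.
        query = self.fact_scorer(h_t).unsqueeze(2)              # (batch, fact_dim, 1)
        fact_logits = torch.bmm(fact_embs, query).squeeze(2)    # (batch, n_facts)
        fact_probs = fact_logits.softmax(dim=1)
        chosen = fact_probs.argmax(dim=1)                       # greedy pick, for illustration
        a_t = fact_embs[torch.arange(batch), chosen]            # (batch, fact_dim)

        # 2) Either copy a word from the predicted fact or generate
        #    a word from the regular vocabulary.
        ctx = torch.cat([h_t, a_t], dim=1)
        p_copy = torch.sigmoid(self.copy_gate(ctx))             # P(copy | context)
        vocab_logits = self.vocab_out(ctx)                      # used when not copying
        return fact_probs, p_copy, vocab_logits
```

Under this reading, a named entity such as a person's birthplace can be emitted by copying it from the fact's object, so it never needs to appear in the softmax vocabulary at all.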
