Enriching Word Embeddings Using Knowledge Graph for Semantic Tagging in Conversational Dialog Systems

Unsupervised word embeddings provide rich linguistic and conceptual information about words. However, they may provide only weak information about domain-specific semantic relations, which can be valuable for tasks such as semantic parsing of natural language queries. To encode prior knowledge about semantic word relations, we present a new method: we extend the neural-network-based lexical word embedding objective function (Mikolov et al., 2013) by incorporating information about the relationships between entities that we extract from knowledge bases. Our model jointly learns lexical word representations from free text, enriched by relational word embeddings learned from relational data (e.g., Freebase) for each type of entity relation. We show empirically, on the task of semantic tagging of natural language queries, that our enriched embeddings provide information about not only short-range syntactic dependencies but also long-range semantic dependencies between words. Using the enriched embeddings, we obtain an average improvement of 2% in F-score over the previous baselines.
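To make the idea of a jointly trained objective concrete, the following is a minimal sketch and not the paper's actual formulation. It assumes a skip-gram loss with negative sampling for word pairs drawn from free text, plus a term of the same form for word pairs linked by a knowledge-base relation, mixed by a hypothetical weight `alpha`. The function names, the toy vocabulary, and the `directed_by` relation are all illustrative assumptions.

```python
import numpy as np


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    return -np.logaddexp(0.0, -x)


def pair_loss(W_in, W_out, center, target, negatives):
    """Negative-sampling loss for a single (center, target) pair."""
    v = W_in[center]
    pos = log_sigmoid(W_out[target] @ v)
    neg = log_sigmoid(-(W_out[negatives] @ v)).sum()
    return -(pos + neg)


def joint_loss(W_in, W_out, text_pairs, relation_pairs, negatives, alpha=0.5):
    """Weighted sum of a free-text term and a relational term.

    alpha is an assumed mixing weight, not a parameter from the paper.
    """
    text = sum(pair_loss(W_in, W_out, c, t, negatives) for c, t in text_pairs)
    rel = sum(pair_loss(W_in, W_out, a, b, negatives) for a, b in relation_pairs)
    return (1.0 - alpha) * text + alpha * rel


# Toy setup (all values hypothetical).
vocab = {"play": 0, "avatar": 1, "cameron": 2, "movie": 3}
rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(len(vocab), 8))   # input (word) vectors
W_out = np.zeros((len(vocab), 8))                    # output (context) vectors

# Co-occurrence pairs from text, e.g. "play avatar movie".
text_pairs = [(vocab["play"], vocab["avatar"]), (vocab["avatar"], vocab["movie"])]
# A knowledge-base pair, e.g. directed_by(avatar, cameron).
relation_pairs = [(vocab["avatar"], vocab["cameron"])]
# Fixed negative samples, for illustration only.
negatives = np.array([vocab["play"], vocab["movie"]])

loss = joint_loss(W_in, W_out, text_pairs, relation_pairs, negatives, alpha=0.5)
print(float(loss))
```

In this sketch the relational term pulls together words that rarely co-occur within a short text window but are connected in the knowledge base, which is one plausible way to capture the long-range semantic dependencies mentioned above. A per-relation variant would keep a separate set of `relation_pairs` (and possibly separate output vectors) for each relation type.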

[1] Andrew McCallum, et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.

[2] Jason Weston, et al. Irreflexive and Hierarchical Relations as Translations, 2013, ArXiv.

[3] Kevin Gimpel, et al. Tailoring Continuous Word Representations for Dependency Parsing, 2014, ACL.

[4] John Langford, et al. Search-based structured prediction, 2009, Machine Learning.

[5] Jeffrey Dean, et al. Distributed Representations of Words and Phrases and their Compositionality, 2013, NIPS.

[6] Ruhi Sarikaya, et al. Shrinkage based features for slot tagging with conditional random fields, 2014, INTERSPEECH.

[7] Mark Dredze, et al. Improving Lexical Embeddings with Semantic Knowledge, 2014, ACL.

[8] Dan Klein, et al. Structure compilation: trading structure for features, 2008, ICML '08.

[9] Geoffrey Zweig, et al. Probabilistic enrichment of knowledge graph entities for relation detection in conversational understanding, 2014, INTERSPEECH.

[10] Ronan Collobert, et al. Is Deep Learning Really Necessary for Word Embeddings, 2013.

[11] Andrew Y. Ng, et al. Solving the Problem of Cascading Errors: Approximate Bayesian Inference for Linguistic Annotation Pipelines, 2006, EMNLP.

[12] Ming Zhou, et al. Learning Sentiment-Specific Word Embedding for Twitter Sentiment Classification, 2014, ACL.

[13] Jason Weston, et al. Connecting Language and Knowledge Bases with Embedding Models for Relation Extraction, 2013, EMNLP.

[14] Nick Craswell, et al. Random walks on the click graph, 2007, SIGIR.