Ontology-Aware Token Embeddings for Prepositional Phrase Attachment

Type-level word embeddings use the same set of parameters to represent all instances of a word regardless of its context, ignoring the inherent lexical ambiguity in language. Instead, we embed semantic concepts (or synsets) as defined in WordNet and represent a word token in a particular context by estimating a distribution over relevant semantic concepts. We use the new, context-sensitive embeddings in a model for predicting prepositional phrase (PP) attachments and jointly learn the concept embeddings and model parameters. We show that using the context-sensitive embeddings improves the accuracy of the PP attachment model by 5.4 absolute percentage points, which amounts to a 34.4% relative reduction in errors.
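
As a rough illustration of the core idea, and not the authors' implementation, the following NumPy sketch builds a token embedding as a context-weighted mixture of the embeddings of the word's WordNet synsets. The function names and the dot-product scoring of senses against the context are illustrative assumptions; in the paper the sense distribution and the synset embeddings are learned jointly with the PP attachment model.

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def ontology_aware_token_embedding(synset_embeddings, context_vector):
        """Estimate a distribution over a word's candidate synsets from the
        context, then return the expected synset embedding as the
        context-sensitive token embedding.

        synset_embeddings: (num_synsets, dim) array, one row per WordNet
                           synset the word token could express.
        context_vector:    (dim,) array summarizing the token's context
                           (e.g., an average of neighboring word vectors;
                           this choice is a simplifying assumption here).
        """
        # Score each candidate sense against the context.
        scores = synset_embeddings @ context_vector
        sense_distribution = softmax(scores)
        # Token embedding = expectation of the synset embeddings under
        # the inferred sense distribution.
        return sense_distribution @ synset_embeddings

    # Toy usage: a word with 3 candidate synsets in a 4-dimensional space.
    synsets = np.random.randn(3, 4)
    context = np.random.randn(4)
    token_vec = ontology_aware_token_embedding(synsets, context)

Because the mixture weights depend on the context vector, the same word type receives different embeddings in different sentences, which is what lets the downstream PP attachment model exploit sense distinctions.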
