Specialising Word Vectors for Lexical Entailment

We present LEAR (Lexical Entailment Attract-Repel), a novel post-processing method that transforms any input word vector space to emphasise the asymmetric relation of lexical entailment (LE), also known as the IS-A or hyponymy-hypernymy relation. By injecting external linguistic constraints (e.g., WordNet links) into the initial vector space, the LE specialisation procedure brings true hyponymy-hypernymy pairs closer together in the transformed Euclidean space. The proposed asymmetric distance measure adjusts the norms of word vectors to reflect the actual WordNet-style hierarchy of concepts. Simultaneously, a joint objective enforces semantic similarity using the symmetric cosine distance, yielding a vector space specialised for both lexical relations at once. LEAR specialisation achieves state-of-the-art performance in the tasks of hypernymy directionality, hypernymy detection, and graded lexical entailment, demonstrating the effectiveness and robustness of the proposed asymmetric specialisation model.
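
The interplay between the two signals described above (a symmetric cosine distance for similarity, and vector norms for position in the concept hierarchy) can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the toy 2-dimensional vectors, and the particular normalised norm-difference term are all hypothetical choices made here for clarity, and the abstract does not specify them.

```python
import math

def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def cosine_distance(u, v):
    """Symmetric distance: 1 - cos(u, v)."""
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 - dot / (norm(u) * norm(v))

def norm_asymmetry(u, v):
    """Asymmetric term (assumed form): negative when u has the smaller norm.

    Dividing by the sum of norms keeps the value in (-1, 1).
    """
    nu, nv = norm(u), norm(v)
    return (nu - nv) / (nu + nv)

def le_score(hypo, hyper):
    """Hypothetical entailment score: lower means 'hypo IS-A hyper' is more plausible."""
    return cosine_distance(hypo, hyper) + norm_asymmetry(hypo, hyper)

# Toy vectors (made up): more general concepts are given larger norms.
terrier = [0.9, 0.1]
dog = [1.8, 0.4]
animal = [3.0, 1.5]

forward = le_score(terrier, dog)   # terrier IS-A dog
backward = le_score(dog, terrier)  # dog IS-A terrier
print(forward < backward)
```

Because the cosine term is identical in both directions, any difference between `forward` and `backward` comes entirely from the norm term, which is what makes a score of this kind direction-sensitive.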

Ivan Vulić | Nikola Mrkšić
