Antonym-Synonym Classification Based on New Sub-space Embeddings

Distinguishing antonyms from synonyms is a key challenge for many NLP applications that rely on lexical-semantic relation extraction. Existing solutions based on large-scale corpora perform poorly because antonym and synonym pairs occur in heavily overlapping contexts. We propose a novel approach based entirely on pre-trained embeddings. We hypothesize that pre-trained embeddings contain a blend of lexical-semantic information, and that the task-specific information can be distilled from them using Distiller, a model proposed in this paper. A classifier is then trained on features constructed from the distilled sub-spaces, together with some word-level features, to distinguish antonyms from synonyms. Experimental results show that the proposed model outperforms existing approaches to antonym-synonym distinction in both speed and classification performance.
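The pipeline described above (project pre-trained embeddings into task-specific sub-spaces, then build pair features for a classifier) can be illustrated with a minimal sketch. This is not the Distiller model itself, whose architecture is not given in this abstract: the two projection matrices `syn_proj` and `ant_proj`, the toy embeddings, and the helper names are all hypothetical, and random matrices stand in for projections that would in practice be learned.

```python
import math
import random


def project(matrix, vec):
    """Apply a linear map (rows x dim matrix) to a dim-sized vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pair_features(emb_a, emb_b, syn_proj, ant_proj):
    """Build a feature vector for a word pair from two sub-spaces.

    ASSUMPTION: one sub-space is meant to capture synonymy and the other
    antonymy; each contributes one cosine score to the feature vector.
    """
    syn_score = cosine(project(syn_proj, emb_a), project(syn_proj, emb_b))
    ant_score = cosine(project(ant_proj, emb_a), project(ant_proj, emb_b))
    return [syn_score, ant_score]


# Toy setup: random stand-ins for pre-trained embeddings and for the
# projections, which a real system would learn from labelled pairs.
rng = random.Random(0)
dim, sub_dim = 8, 3
syn_proj = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(sub_dim)]
ant_proj = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(sub_dim)]
emb = {w: [rng.gauss(0, 1) for _ in range(dim)] for w in ("hot", "cold", "warm")}

features = pair_features(emb["hot"], emb["cold"], syn_proj, ant_proj)
print(features)
```

In a full system, feature vectors like `features` would be concatenated with word-level features and passed to a standard classifier trained on labelled antonym and synonym pairs.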
