Unraveling Antonym's Word Vectors through a Siamese-like Network

Discriminating antonyms from synonyms is an important NLP task, made difficult by the fact that both antonyms and synonyms carry similar distributional information; consequently, antonym pairs and synonym pairs may have similar word vectors. We present an approach to disentangle antonymy and synonymy from word vectors, inspired by siamese networks. The model trains the same base network in two phases: a pre-training phase following a siamese model supervised by synonyms, and a training phase on antonyms through a siamese-like model that accommodates the anti-transitivity present in antonymy. The approach exploits the observation that the common antonyms of a word tend to be synonyms of each other. We show that our approach outperforms distributional and pattern-based approaches while relying on a simple feed-forward network as the base network of both training phases.
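The two-phase idea can be illustrated with a minimal sketch: a single shared feed-forward encoder processes both words of a pair, and a pair-wise loss pulls encoded synonyms together (target similarity +1, pre-training phase) while pushing encoded antonyms apart (target similarity −1, training phase). All sizes, the toy vectors, and the squared-error loss below are illustrative assumptions, not the authors' exact model or hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

DIM_IN, DIM_OUT = 50, 16  # hypothetical dimensions, chosen for illustration

# Shared ("siamese") feed-forward base network: both words of a pair are
# projected through the SAME weights, so one mapping is learned for all words.
W = rng.normal(scale=0.1, size=(DIM_IN, DIM_OUT))

def encode(v):
    """Project a word vector through the shared base network (one tanh layer)."""
    return np.tanh(v @ W)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def pair_loss(v1, v2, target):
    """Squared error between the encoded cosine similarity and a target:
    +1 for synonym pairs (siamese pre-training phase), -1 for antonym
    pairs (siamese-like antonym phase), separating them in the new space."""
    return (cosine(encode(v1), encode(v2)) - target) ** 2

# Toy word vectors standing in for pre-trained embeddings (e.g. GloVe).
hot, warm, cold = (rng.normal(size=DIM_IN) for _ in range(3))

syn_loss = pair_loss(hot, warm, target=+1.0)   # synonym pair: pull together
ant_loss = pair_loss(hot, cold, target=-1.0)   # antonym pair: push apart
print(round(syn_loss, 3), round(ant_loss, 3))
```

A real training loop would minimize these losses over lexicon-derived synonym and antonym pairs via gradient descent; the sketch only shows how one shared encoder can serve both supervision signals.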
