Distinguishing Antonyms and Synonyms in a Pattern-based Neural Network

Distinguishing between antonyms and synonyms is a key task to achieve high performance in NLP systems. While they are notoriously difficult to distinguish by distributional co-occurrence models, pattern-based methods have proven effective to differentiate between the relations. In this paper, we present a novel neural network model AntSynNET that exploits lexico-syntactic patterns from syntactic parse trees. In addition to the lexical and syntactic information, we successfully integrate the distance between the related words along the syntactic path as a new pattern feature. The results from classification experiments show that AntSynNET improves the performance over prior pattern-based methods.

[1]  Michael Roth,et al.  Combining Word Patterns and Discourse Markers for Paradigmatic Relation Classification , 2014, ACL.

[2]  Roy Schwartz,et al.  Symmetric Pattern Based Word Embeddings for Improved Word Similarity Prediction , 2015, CoNLL.

[3]  Ming Zhou,et al.  Identifying Synonyms among Distributionally Similar Words , 2003, IJCAI.

[4]  J. Deese The structure of associations in language and thought , 1966 .

[5]  Patrick Pantel,et al.  From Frequency to Meaning: Vector Space Models of Semantics , 2010, J. Artif. Intell. Res..

[6]  Geoffrey Zweig,et al.  Polarity Inducing Latent Semantic Analysis , 2012, EMNLP.

[7]  George A. Miller,et al.  WordNet: A Lexical Database for English , 1995, HLT.

[8]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[9]  Qin Lu,et al.  Chasing Hypernyms in Vector Spaces with Entropy , 2014, EACL.

[10]  Slava M. Katz,et al.  Co-Occurrences of Antonymous Adjectives and Their Contexts , 1991, Comput. Linguistics.

[11]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[12]  G. Miller,et al.  Contexts of antonymous adjectives , 1989, Applied Psycholinguistics.

[13]  G. Miller,et al.  Semantic networks of english , 1991, Cognition.

[14]  Zellig S. Harris,et al.  Distributional Structure , 1954 .

[15]  Sabine Schulte im Walde,et al.  Pattern-Based Distinction of Paradigmatic Relations for German Nouns, Verbs, Adjectives , 2013, GSCL.

[16]  Makoto Miwa,et al.  Word Embedding-based Antonym Detection using Thesauri and Distributional Information , 2015, NAACL.

[17]  Matthew D. Zeiler ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.

[18]  Geoffrey Zweig,et al.  Linguistic Regularities in Continuous Space Word Representations , 2013, NAACL.

[19]  John Salvatier,et al.  Theano: A Python framework for fast computation of mathematical expressions , 2016, ArXiv.

[20]  J. Firth,et al.  Papers in linguistics, 1934-1951 , 1957 .

[21]  Marti A. Hearst Automatic Acquisition of Hyponyms from Large Text Corpora , 1992, COLING.

[22]  Christiane Fellbaum,et al.  Co-Occurrence and Antonymy , 1995 .

[23]  Ngoc Thang Vu,et al.  Combining Recurrent and Convolutional Neural Networks for Relation Classification , 2016, NAACL.

[24]  Ido Dagan,et al.  Improving Hypernymy Detection with an Integrated Path-based and Distributional Method , 2016, ACL.

[25]  Sabine Schulte im Walde,et al.  Uncovering Distributional Differences between Synonyms and Antonyms in a Word Space Model , 2013, IJCNLP.

[26]  Ngoc Thang Vu,et al.  Integrating Distributional Lexical Contrast into Word Embeddings for Antonym-Synonym Distinction , 2016, ACL.

[27]  Raffaella Bernardi,et al.  Entailment above the word level in distributional semantics , 2012, EACL.