Embedding Semantic Relations into Word Representations

Learning representations for semantic relations is important for various tasks such as analogy detection, relational search, and relation classification. Although there have been several proposals for learning representations for individual words, learning word representations that explicitly capture the semantic relations between words remains underdeveloped. We propose an unsupervised method for learning vector representations for words such that the learnt representations are sensitive to the semantic relations that exist between two words. First, we extract lexical patterns from the co-occurrence contexts of two words in a corpus to represent the semantic relations between those words. Second, we represent a lexical pattern as the weighted sum of the representations of the words that co-occur with that pattern. Third, we train a binary classifier to distinguish relationally similar from non-similar lexical pattern pairs. The proposed method is unsupervised in the sense that the lexical pattern pairs we use as training data are sampled automatically from a corpus, without any manual intervention. On three benchmark datasets for proportional analogy detection, the proposed method outperforms current state-of-the-art word representations by a statistically significant margin, demonstrating its ability to accurately capture the semantic relations among words.
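
The second and third steps can be sketched in a few lines of code. This is a minimal illustration, not the authors' implementation: the toy vocabulary, the stand-in embeddings, the normalised co-occurrence weighting, and the logistic scoring function over pattern pairs are all assumptions made for the sake of the example; in the actual method the word representations themselves are learnt jointly with the classifier.

```python
import numpy as np

# Hypothetical inputs (not from the paper): a small vocabulary with
# stand-in embeddings, and co-occurrence counts between lexical
# patterns and words.
rng = np.random.default_rng(0)
vocab = ["ostrich", "bird", "lion", "cat", "large"]
dim = 50
word_vec = {w: rng.normal(size=dim) for w in vocab}  # stand-in embeddings

def pattern_vector(cooc_counts):
    """Represent a lexical pattern as the weighted sum of the embeddings
    of the words that co-occur with it (here, weights are normalised
    co-occurrence counts; the weighting scheme is an assumption)."""
    vec = np.zeros(dim)
    total = sum(cooc_counts.values())
    for word, count in cooc_counts.items():
        vec += (count / total) * word_vec[word]
    return vec

# Toy co-occurrence profiles for two patterns such as "X is a large Y".
p1 = pattern_vector({"ostrich": 3, "bird": 3, "large": 2})
p2 = pattern_vector({"lion": 2, "cat": 2, "large": 1})

# Binary classification of a pattern pair as relationally similar vs.
# non-similar: a simple logistic score over the element-wise product of
# the two pattern vectors. The paper's actual objective and training
# procedure differ; this only illustrates the overall pipeline.
w = rng.normal(size=dim)  # classifier weights (would be learned)
score = 1.0 / (1.0 + np.exp(-w @ (p1 * p2)))
print(f"P(relationally similar) = {score:.3f}")
```

In training, pairs of patterns that co-occur with the same word pairs would serve as positive (relationally similar) examples, with mismatched pairs as negatives, so that no manual labelling is required.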
