A critique of word similarity as a method for evaluating distributional semantic models

This paper aims to re-think the role of the word similarity task in distributional semantics research. We argue that while it is a valuable tool, it should be used with care because it provides only an approximate measure of the quality of a distributional model. Word similarity evaluations assume there exists a single notion of similarity that is independent of a particular application. Further, the small size and low inter-annotator agreement of existing data sets make it challenging to find significant differences between models.
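
The point about small data sets can be illustrated with a minimal sketch. It scores two hypothetical models against human similarity ratings using Spearman correlation, then estimates a percentile bootstrap confidence interval for each score. The ratings, the model scores, and the helper names (`rank`, `spearman`, `bootstrap_ci`) are all made up for illustration, and this is not necessarily the procedure the paper itself uses. With only a handful of word pairs the intervals tend to be wide and to overlap, which is one way to see why differences between models are hard to establish.

```python
import random

def rank(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined here; 0.0 is a sketch-level choice
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))

def bootstrap_ci(human, model, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for Spearman's rho, resampling word pairs."""
    rng = random.Random(seed)
    n = len(human)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(spearman([human[i] for i in idx], [model[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_resamples)]
    hi = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Made-up data: human ratings for ten word pairs and two models' similarity scores.
human = [9.2, 8.5, 7.9, 6.1, 5.4, 4.8, 3.3, 2.7, 1.9, 0.6]
model_a = [0.81, 0.77, 0.60, 0.66, 0.52, 0.41, 0.45, 0.30, 0.22, 0.10]
model_b = [0.74, 0.80, 0.71, 0.50, 0.58, 0.35, 0.33, 0.38, 0.15, 0.20]

for name, scores in (("model_a", model_a), ("model_b", model_b)):
    rho = spearman(human, scores)
    lo, hi = bootstrap_ci(human, scores)
    print(f"{name}: rho={rho:.3f}, 95% bootstrap CI=({lo:.3f}, {hi:.3f})")
```

Resampling word pairs treats the human ratings as fixed, so the sketch does not capture the separate uncertainty that comes from low inter-annotator agreement.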

David J. Weir | Julie Weeds | Jeremy Reffin | Thomas Kober | Miroslav Batchkarov
