An Improved Crowdsourcing Based Evaluation Technique for Word Embedding Methods

In this proposal-track paper, we present a crowdsourcing-based technique for evaluating word embeddings that is intended to be more reliable and more linguistically justified than existing approaches. The method is designed for intrinsic evaluation and extends the approach proposed by Schnabel et al. (2015). The improved technique measures word relatedness with respect to the context in which a word appears.

[1]  Makoto Nagao,et al.  General Word Sense Disambiguation Method Based on a Full Sentential Context , 1998, WordNet@ACL/COLING.

[2]  Rada Mihalcea,et al.  Word Sense Disambiguation Using Wikipedia , 2013, The People's Web Meets NLP.

[3]  Ted Pedersen,et al.  Using Measures of Semantic Relatedness for Word Sense Disambiguation , 2003, CICLing.

[4]  Dean P. Foster,et al.  Two Step CCA: A new spectral method for estimating vector models of words , 2012, ICML 2012.

[5]  Thorsten Joachims,et al.  Evaluation methods for unsupervised word embeddings , 2015, EMNLP.

[6]  George A. Miller,et al.  WordNet: A Lexical Database for English , 1995, HLT.

[7]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[8]  Raymond J. Mooney,et al.  Multi-Prototype Vector-Space Models of Word Meaning , 2010, NAACL.

[9]  Ignacio Iacobacci,et al.  SensEmbed: Learning Sense Embeddings for Word and Relational Similarity , 2015, ACL.

[10]  Christine Chiarello,et al.  Sentence context and lexical ambiguity resolution by the two hemispheres , 1998, Neuropsychologia.

[11]  Rada Mihalcea,et al.  Sense Clustering Using Wikipedia , 2013, RANLP.

[12]  E. H. Hutten SEMANTICS , 1953, The British Journal for the Philosophy of Science.

[13]  Chris Biemann Creating a system for lexical substitutions from scratch using crowdsourcing , 2013 .

[14]  Dean P. Foster,et al.  Eigenwords: spectral word embeddings , 2015, J. Mach. Learn. Res.

[15]  Ronan Collobert,et al.  Word Embeddings through Hellinger PCA , 2013, EACL.

[16]  Judy Pearsall,et al.  Oxford Dictionary of English , 2010 .

[17]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[18]  Tie-Yan Liu,et al.  WordRep: A Benchmark for Research on Learning Word Representations , 2014, ArXiv.

[19]  Andrew Y. Ng,et al.  Improving Word Representations via Global Context and Multiple Word Prototypes , 2012, ACL.

[20]  George A. Miller,et al.  A Semantic Concordance , 1993, HLT.

[21]  Mark Stevenson,et al.  Mapping WordNet synsets to Wikipedia articles , 2012, LREC.

[22]  Chris Callison-Burch,et al.  Fast, Cheap, and Creative: Evaluating Translation Quality Using Amazon’s Mechanical Turk , 2009, EMNLP.

[23]  Piek T. J. M. Vossen,et al.  DutchSemCor: in quest of the ideal sense-tagged corpus , 2013, RANLP.

[24]  Kenneth Ward Church,et al.  Very sparse random projections , 2006, KDD '06.

[25]  Jason Weston,et al.  Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res.

[26]  Jaime G. Carbonell,et al.  Active learning and crowdsourcing for machine translation in low resource scenarios , 2012 .

[27]  Andrew McCallum,et al.  Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space , 2014, EMNLP.

[28]  Rada Mihalcea,et al.  Using Wikipedia for Automatic Word Sense Disambiguation , 2007, NAACL.

[29]  Christian Biemann Creating a system for lexical substitutions from scratch using crowdsourcing , 2013, Lang. Resour. Evaluation.

[30]  Omer Levy,et al.  A Simple Word Embedding Model for Lexical Substitution , 2015, VS@HLT-NAACL.

[31]  Roberto Navigli,et al.  Meaningful Clustering of Senses Helps Boost Word Sense Disambiguation Performance , 2006, ACL.

[32]  Linfeng Song,et al.  Word Embeddings, Sense Embeddings and their Application to Word Sense Induction , 2016 .

[33]  Jaime G. Carbonell,et al.  Active Learning and Crowd-Sourcing for Machine Translation , 2010, LREC.