Direct vs. indirect evaluation of distributional thesauri

With the success of word embedding methods in various Natural Language Processing tasks, the whole field of distributional semantics has seen renewed interest. Besides the well-known word2vec, recent studies have presented efficient techniques for building distributional thesauri; in particular, Claveau et al. (2014) have shown that Information Retrieval (IR) tools and concepts can be successfully used to build a thesaurus. In this paper, we address the problem of evaluating such thesauri or embedding models. Several evaluation scenarios are considered: direct evaluation through reference lexicons and specially crafted datasets, and indirect evaluation through third-party tasks, namely lexical substitution and Information Retrieval. For the latter task, we adopt the query expansion framework proposed by Claveau and Kijak (2016). Through several experiments, we first show that the recent techniques for building distributional thesauri outperform the word2vec approach, regardless of the evaluation scenario. We also highlight the differences between the evaluation scenarios, which may lead to very different conclusions when comparing distributional models. Finally, we study the effect of some parameters of the distributional models on these evaluation scenarios.

[1] Stephen E. Robertson, et al. A domain-independent approach to finding related entities, 2012, Inf. Process. Manag.

[2] Omer Levy, et al. Neural Word Embedding as Implicit Matrix Factorization, 2014, NIPS.

[3] Geoffrey Zweig, et al. Linguistic Regularities in Continuous Space Word Representations, 2013, NAACL.

[4] Yasuhiro Ogawa, et al. Selection of Effective Contextual Information for Automatic Synonym Acquisition, 2006, ACL.

[5] Jean-Cédric Chappelier, et al. Textual similarities based on a distributional approach, 1999, Proceedings. Tenth International Workshop on Database and Expert Systems Applications. DEXA 99.

[6] T. Van de Cruys, et al. Mining for meaning: the extraction of lexico-semantic knowledge from text, 2010.

[7] Alessandro Lenci, et al. How we BLESSed distributional semantic evaluation, 2011, GEMS.

[8] Olivier Ferret, et al. Typing Relations in Distributional Thesauri, 2015.

[9] Olivier Ferret. Identifying Bad Semantic Neighbors for Improving Distributional Thesauri, 2013, ACL.

[10] Vincent Claveau, et al. Distributional Thesauri for Information Retrieval and vice versa, 2016, LREC.

[11] Kazuhide Yamamoto, et al. Even Unassociated Features Can Improve Lexical Distributional Similarity, 2010.

[12] George A. Miller, et al. Introduction to WordNet: An On-line Lexical Database, 1990.

[13] B. Hammond. Ontology, 2004, Lawrence Booth’s Book of Visions.

[14] Victor Maojo, et al. A context vector model for information retrieval, 2002, J. Assoc. Inf. Sci. Technol.

[15] Patrick Pantel, et al. From Frequency to Meaning: Vector Space Models of Semantics, 2010, J. Artif. Intell. Res.

[16] Evgeniy Gabrilovich, et al. Computing Semantic Relatedness Using Wikipedia-based Explicit Semantic Analysis, 2007, IJCAI.

[17] Danushka Bollegala, et al. Measuring semantic similarity between words using web search engines, 2007, WWW '07.

[18] Gregory Grefenstette, et al. Explorations in automatic thesaurus discovery, 1994.

[19] Vincent Claveau, et al. Improving distributional thesauri by exploring the graph of neighbors, 2014, COLING.

[20] Thierry Poibeau, et al. Latent Vector Weighting for Word Meaning in Context, 2011, EMNLP.

[21] Christopher D. Manning, et al. Introduction to Information Retrieval, 2010, J. Assoc. Inf. Sci. Technol.

[22] T. Landauer, et al. A Solution to Plato's Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge, 1997.

[23] Graeme Hirst, et al. Evaluating WordNet-based Measures of Lexical Semantic Relatedness, 2006, CL.

[24] J. R. Firth, et al. A Synopsis of Linguistic Theory, 1930-1955, 1957.

[25] Peter D. Turney. Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL, 2001, ECML.

[26] Ellen M. Voorhees, et al. Query expansion using lexical-semantic relations, 1994, SIGIR '94.

[27] Stan Szpakowicz, et al. Rank-Based Transformation in Measuring Semantic Relatedness, 2009, Canadian Conference on AI.

[28] W. Bruce Croft, et al. Evaluation of an inference network-based retrieval model, 1991, TOIS.

[29] Felix Hill, et al. SimLex-999: Evaluating Semantic Models With (Genuine) Similarity Estimation, 2014, CL.

[30] Dekang Lin, et al. Automatic Retrieval and Clustering of Similar Words, 1998, ACL.

[31] Mirella Lapata, et al. Dependency-Based Construction of Semantic Space Models, 2007, CL.

[32] T. Van de Cruys, et al. Mining for meaning, 2010.

[33] Philippe Muller, et al. Évaluer Et Améliorer Une Ressource Distributionnelle : Protocole D'annotation De Liens Sémantiques En Contexte, 2013, Trait. Autom. des Langues.

[34] Stephen E. Robertson, et al. Okapi at TREC-7: Automatic Ad Hoc, Filtering, VLC and Interactive, 1998, TREC.

[35] Magnus Sahlgren, et al. Vector-based semantic analysis: representing word meanings based on random labels, 2001.

[36] W. Bruce Croft, et al. A Language Modeling Approach to Information Retrieval, 1998, SIGIR Forum.

[37] Roberto Navigli, et al. The English lexical substitution task, 2009, Lang. Resour. Evaluation.

[38] Mehran Sahami, et al. A web-based kernel function for measuring the similarity of short text snippets, 2006, WWW '06.

[39] Ido Dagan, et al. Bootstrapping Distributional Feature Vector Quality, 2009, CL.

[40] W. Bruce Croft, et al. Indri: A language-model based search engine for complex queries (extended version), 2005.

[41] Andrew Y. Ng, et al. Improving Word Representations via Global Context and Multiple Word Prototypes, 2012, ACL.

[42] W. Bruce Croft, et al. Combining the language model and inference network approaches to retrieval, 2004, Inf. Process. Manag.