Toward a Computational Multidimensional Lexical Similarity Measure for Modeling Word Association Tasks in Psycholinguistics

This paper presents the first results of the "Evolex" project, a multidisciplinary project bringing together researchers in Psycholinguistics, Neuropsychology, Computer Science, Natural Language Processing and Linguistics. The Evolex project aims to propose a new data-driven, inductive method for automatically characterising the relation between pairs of French words collected in psycholinguistic experiments on lexical access. This method takes advantage of several complementary computational measures of semantic similarity. We show that some measures correlate more strongly than others with the frequency of lexical associations, and that they also differ in how they capture different semantic relations. This allows us to consider building a multidimensional lexical similarity measure to automate the classification of lexical associations.
