Word associations and the distance properties of context-aware word embeddings

What do people know when they know the meaning of words? Word associations have been widely used to tap into lexical representations and their structure, as a way of probing semantic knowledge in humans. We investigate whether current word embedding spaces (contextualized and uncontextualized) can be considered good models of human lexical knowledge by studying whether they have characteristics comparable to those of human association spaces. We study three properties: association rank, asymmetry of similarity, and the triangle inequality. We find that word embeddings are good models of some properties of word associations. They replicate human associations between words well and, like humans, their context-aware variants show violations of the triangle inequality. While they do show asymmetry of similarities, their asymmetries do not map onto those of human association norms.
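As a rough illustration of two of the properties named above, the following sketch checks a triangle-inequality violation and a rank asymmetry on toy vectors. It rests on assumptions that the abstract does not state: cosine distance (1 minus cosine similarity) as the distance measure, and nearest-neighbor rank as the source of asymmetry. The function names and the toy vectors are hypothetical and are not taken from the paper.

```python
import math
import numpy as np


def cosine_distance(u, v):
    """Cosine distance, 1 - cos(u, v). Assumed measure, not confirmed by the abstract."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def violates_triangle_inequality(a, b, c, dist=cosine_distance, tol=1e-12):
    """True if d(a, c) > d(a, b) + d(b, c), i.e. the detour via b is 'shorter'."""
    return dist(a, c) > dist(a, b) + dist(b, c) + tol


def neighbor_rank(source, target, vectors, dist=cosine_distance):
    """1-based rank of `target` among the neighbors of `source`, nearest first."""
    others = [w for w in vectors if w != source]
    ordered = sorted(others, key=lambda w: dist(vectors[source], vectors[w]))
    return ordered.index(target) + 1


def unit(angle_degrees):
    """Hypothetical helper: a 2-D unit vector at the given angle."""
    t = math.radians(angle_degrees)
    return np.array([math.cos(t), math.sin(t)])


# Toy "embeddings" placed at 0, 30, 40 and 50 degrees (made-up data).
toy = {"a": unit(0), "b": unit(30), "c": unit(40), "d": unit(50)}

# Cosine distance is symmetric, but neighbor ranks need not be:
# b is a's nearest neighbor, while a is only b's third nearest neighbor.
rank_a_to_b = neighbor_rank("a", "b", toy)
rank_b_to_a = neighbor_rank("b", "a", toy)

# Cosine distance is not a metric: going from 0 to 90 degrees via 45 degrees
# gives two short hops whose summed distance is below the direct distance.
violation = violates_triangle_inequality(unit(0), unit(45), unit(90))
```

Rank-based comparison is only one way to obtain asymmetry from a symmetric similarity measure; the measure used for the reported results may differ.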
