UiO-UvA at SemEval-2020 Task 1: Contextualised Embeddings for Lexical Semantic Change Detection

We apply contextualised word embeddings to lexical semantic change detection in SemEval-2020 Shared Task 1. This paper focuses on Subtask 2, ranking words by the degree of their semantic drift over time. We analyse the performance of two contextualising architectures (BERT and ELMo) and three change detection algorithms. We find that the most effective algorithms rely on the cosine similarity between averaged token embeddings and on the pairwise distances between token embeddings. They outperform strong baselines by a large margin (in the post-evaluation phase, ours is the best Subtask 2 submission to SemEval-2020 Task 1), but, interestingly, the choice of a particular algorithm depends on the distribution of gold scores in the test set.
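
To make the two best-performing change measures named above concrete, here is a minimal NumPy sketch: the cosine distance between averaged token embeddings (prototypes) and the average pairwise cosine distance between token embeddings from the two time periods. The function names and the random toy data are illustrative assumptions, not the authors' released implementation; each input matrix is assumed to hold one contextualised embedding (from ELMo or BERT) per occurrence of the target word in a given corpus.

import numpy as np

def prt_change(emb_t1: np.ndarray, emb_t2: np.ndarray) -> float:
    """Cosine distance between the averaged token embeddings (word prototypes)
    of the two time periods; higher values indicate more semantic change."""
    proto1, proto2 = emb_t1.mean(axis=0), emb_t2.mean(axis=0)
    cos = np.dot(proto1, proto2) / (np.linalg.norm(proto1) * np.linalg.norm(proto2))
    return 1.0 - float(cos)

def apd_change(emb_t1: np.ndarray, emb_t2: np.ndarray) -> float:
    """Average pairwise cosine distance between token embeddings drawn from
    the two time periods; higher values indicate more semantic change."""
    # Normalise rows to unit length so that dot products equal cosine similarities.
    e1 = emb_t1 / np.linalg.norm(emb_t1, axis=1, keepdims=True)
    e2 = emb_t2 / np.linalg.norm(emb_t2, axis=1, keepdims=True)
    sims = e1 @ e2.T  # shape (n1, n2): all cross-period cosine similarities
    return float(np.mean(1.0 - sims))

# Toy usage: random stand-ins for contextualised embeddings of one target word
# (50 usages in the earlier corpus, 40 in the later one, 768 dimensions).
rng = np.random.default_rng(0)
emb_old = rng.normal(size=(50, 768))
emb_new = rng.normal(size=(40, 768))
print("PRT change score:", prt_change(emb_old, emb_new))
print("APD change score:", apd_change(emb_old, emb_new))

In practice the per-usage embeddings would be extracted from the two time-specific corpora with a pretrained ELMo or BERT model before the scores are computed and used to rank the target words.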
