UWB @ DIACR-Ita: Lexical Semantic Change Detection with CCA and Orthogonal Transformation

In this paper, we describe our method for detecting lexical semantic change (i.e., changes in word sense over time) for the DIACR-Ita shared task, where we ranked 1st. We examine semantic differences between specific words in two Italian corpora drawn from different time periods. Our method is fully unsupervised and language independent. First, we build a semantic vector space for each corpus, one for the earlier period and one for the later period. Then we compute a linear transformation between the earlier and later spaces, using Canonical Correlation Analysis (CCA) and Orthogonal Transformation. Finally, we measure the cosine similarity between the transformed vectors of each target word.
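The align-then-compare pipeline described above can be illustrated with a minimal sketch. It covers only the orthogonal variant (solved here as an orthogonal Procrustes problem via SVD) and leaves out CCA; the function names, the toy random data, and the choice to fit the map on all words except the simulated changed word are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def orthogonal_map(X, Y):
    """Return the orthogonal W minimising ||X W - Y||_F (orthogonal Procrustes)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def change_scores(earlier, later, W):
    """Cosine between each mapped earlier vector and its later counterpart.

    Rows of `earlier` and `later` are assumed to refer to the same words.
    A low cosine suggests the word is used differently in the two periods.
    """
    mapped = earlier @ W
    return np.array([cosine(a, b) for a, b in zip(mapped, later)])


# Toy data (hypothetical): 50 "words" in an 8-dimensional space.
rng = np.random.default_rng(0)
earlier = rng.normal(size=(50, 8))

# The later space is a rotated copy of the earlier one ...
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
later = earlier @ Q

# ... except word 0, whose vector is replaced to simulate a sense change.
later[0] = rng.normal(size=8)

# Fit the map on the remaining words, then score every word.
W = orthogonal_map(earlier[1:], later[1:])
scores = change_scores(earlier, later, W)
```

In this toy setting the unchanged words end up with cosines close to 1 after alignment, while the replaced word scores lower, so sorting `scores` in ascending order would rank it as the most changed. With real embeddings the separation is noisier, and how a binary changed/unchanged decision is derived from the cosines is a separate design choice that this sketch does not address.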
