Paper Information - SChME at SemEval-2020 Task 1: A Model Ensemble for Detecting Lexical Semantic Change

This paper describes SChME (Semantic Change Detection with Model Ensemble), a method used in SemEval-2020 Task 1 on unsupervised detection of lexical semantic change. SChME uses a model ensemble combining signals of distributional models (word embeddings) and word frequency models, where each model casts a vote indicating the probability that a word suffered semantic change according to that feature. More specifically, we combine the cosine distance of word vectors with a neighborhood-based metric we named Mapped Neighborhood Distance (MAP) and a word frequency differential metric as input signals to our model. Additionally, we explore alignment-based methods to investigate the importance of the landmarks used in this process. Our results show evidence that the number of landmarks used for alignment has a direct impact on the predictive performance of the model. Moreover, we show that languages that suffer less semantic change tend to benefit from using a large number of landmarks, whereas languages with more semantic change benefit from a more careful choice of landmark number for alignment.
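The ensemble idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rank-based conversion of scores into votes, the log-ratio form of the frequency differential, the simple averaging of votes, and all function names and toy data are assumptions made for illustration. The MAP signal is left out because its definition is not given here, and the word vectors are assumed to be already aligned to a common space.

```python
import math

def cosine_distance(u, v):
    """Cosine distance between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def frequency_differential(count_1, total_1, count_2, total_2):
    """Absolute change in log relative frequency (assumed form of the signal)."""
    return abs(math.log(count_2 / total_2) - math.log(count_1 / total_1))

def rank_votes(scores):
    """Map raw scores to votes in (0, 1] by empirical rank (assumed scheme)."""
    values = list(scores.values())
    n = len(values)
    return {w: sum(1 for x in values if x <= s) / n for w, s in scores.items()}

def ensemble(signal_scores):
    """Average the per-signal votes for each word (assumed combination rule)."""
    votes = [rank_votes(scores) for scores in signal_scores]
    return {w: sum(v[w] for v in votes) / len(votes) for w in votes[0]}

# Toy data: vectors from two time periods, assumed already aligned.
vectors_t1 = {"plane": [1.0, 0.0], "tree": [0.0, 1.0], "cell": [1.0, 1.0]}
vectors_t2 = {"plane": [0.0, 1.0], "tree": [0.0, 2.0], "cell": [1.0, 0.0]}
counts_t1 = {"plane": 10, "tree": 50, "cell": 20}
counts_t2 = {"plane": 40, "tree": 50, "cell": 30}
total_t1, total_t2 = 1000, 1000

cos_scores = {w: cosine_distance(vectors_t1[w], vectors_t2[w]) for w in vectors_t1}
freq_scores = {
    w: frequency_differential(counts_t1[w], total_t1, counts_t2[w], total_t2)
    for w in counts_t1
}
change_score = ensemble([cos_scores, freq_scores])

# Print words from most to least likely changed under this toy setup.
for word, score in sorted(change_score.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {score:.3f}")
```

In this toy setup each signal votes independently, so a word ranks high overall only when both its vector and its relative frequency shift between the two periods.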

Pin-Yu Chen | Sibel Adali | Maurício Gruppi
