Simulating Lexical Semantic Change from Sense-Annotated Data

We present a novel procedure for simulating lexical semantic change (LSC) from synchronic sense-annotated data, and demonstrate its usefulness for assessing LSC detection models. The induced dataset corresponds more closely to empirically observed lexical semantic change than previous synthetic datasets do, because it exploits the close relationship between synchronic polysemy and diachronic change. We publish the data and provide the first large-scale gold standard for evaluating LSC detection models.
