Digital begriffsgeschichte: Tracing semantic change using word embeddings

Abstract

Recently, word embedding models (WEMs) have received ample attention in the natural language processing community. These models capture semantic information in large text corpora by learning the distributional properties of words, that is, how often particular words appear in specific contexts. Scholars have pointed out the potential of WEMs for historical research. In particular, their ability to capture semantic change might assist historians who study conceptual change or specific discursive formations over time. At the same time, critics have pointed out that WEMs require large amounts of training data, that they are challenging to evaluate, and that they lack the specificity historians look for. The ability to examine semantic change resonates with the goals of historians such as Reinhart Koselleck, whose research focused on the formation of concepts and the transformation of semantic fields. However, word embeddings can only be used to study particular types of semantic change, and their usefulness depends on the size, quality, and biases of the training data. In this article, we examine what is required of historical data to produce reliable WEMs, and we describe the types of questions that can be answered using WEMs.
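To make the notion of "distributional properties" concrete, the following is a minimal sketch of how context counts can be turned into word vectors and compared. It is not the method used in the article: it is a simple count-based illustration rather than a trained WEM, and the toy corpus, the function names, and the window size of 2 are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Count, for every word, the words that occur within `window` tokens of it."""
    vectors = {}
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Context = neighbours on both sides, excluding the word itself.
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus, invented for illustration only.
toy_corpus = [
    "the king rules the country".split(),
    "the queen rules the country".split(),
    "the farmer ploughs the field".split(),
]

vectors = cooccurrence_vectors(toy_corpus)
sim_king_queen = cosine(vectors["king"], vectors["queen"])
sim_king_farmer = cosine(vectors["king"], vectors["farmer"])
print(sim_king_queen, sim_king_farmer)
```

In this toy setting, "king" and "queen" share exactly the same contexts and therefore score higher than "king" and "farmer". Training the same kind of vectors separately on texts from different periods and comparing a word's neighbours across periods is one common way to look for semantic change, although real historical corpora are far larger and noisier than this sketch suggests.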
