Usage Fluctuation Analysis

Abstract: This article introduces Usage Fluctuation Analysis (UFA), a methodology for the diachronic analysis of large historical corpora. UFA looks at the fluctuation of the usage of a word as observed through collocation. It presupposes neither a commitment to a specific semantic theory nor that the results will focus solely on semantics; the focus, rather, is on a word’s usage. UFA considers large amounts of evidence about usage through time, as made available by historical corpora, and displays fluctuation in word usage in the form of a graph. The paper provides guidelines for the interpretation of UFA graphs and three short case studies applying the technique to (i) the word "its" and (ii) two words related to social actors, "whore" and "harlot". These case studies relate UFA to prior, labour-intensive corpus and historical analyses. They also highlight the novel observations that the technique affords.
