The Time Course of Language Change

This paper presents a numeric, information-theoretic model for measuring language change without specifying the particular type of change. It is shown that this measurement is intuitively plausible and that meaningful measurements can be made from as few as 1000 characters. The measurement technique is then extended to the task of determining the "rate" of language change: brief excerpts from National Geographic Magazine are examined to determine both their linguistic distance from one another and the number of years of temporal separation between them. A statistical analysis of these results shows, first, that language change can be measured, and second, that the rate of language change has not been uniform; in particular, the period 1939-1948 had particularly slow change, while 1949-1958 and 1959-1968 had particularly rapid changes.
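The abstract does not say which estimator the paper uses, so the following is only an illustrative sketch of what an information-theoretic distance between two short text samples could look like. It assumes a character-bigram model with add-one smoothing, which is a stand-in technique chosen for brevity rather than the paper's method. The function names (`char_ngram_counts`, `cross_entropy`, `linguistic_distance`) are hypothetical.

```python
import math
from collections import Counter


def char_ngram_counts(text, n=2):
    """Count character n-grams and the (n-1)-character contexts that start them."""
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    contexts = Counter(text[i:i + n - 1] for i in range(len(text) - n + 1))
    return grams, contexts


def cross_entropy(model_text, test_text, n=2):
    """Average bits per n-gram needed to encode test_text with an
    add-one-smoothed character n-gram model estimated from model_text.

    Assumption: add-one smoothing over the joint alphabet of both samples.
    """
    grams, contexts = char_ngram_counts(model_text, n)
    vocab_size = len(set(model_text) | set(test_text))
    total_bits = 0.0
    positions = 0
    for i in range(len(test_text) - n + 1):
        gram = test_text[i:i + n]
        context = gram[:-1]
        # Smoothed conditional probability of the last character given its context.
        p = (grams[gram] + 1) / (contexts[context] + vocab_size)
        total_bits -= math.log2(p)
        positions += 1
    return total_bits / positions if positions else 0.0


def linguistic_distance(sample_a, sample_b, n=2):
    """Symmetrised cross-entropy between two samples (an illustrative choice)."""
    return 0.5 * (cross_entropy(sample_a, sample_b, n)
                  + cross_entropy(sample_b, sample_a, n))
```

Under these assumptions, two excerpts of about 1000 characters each could be passed to `linguistic_distance`, and the resulting values compared against the number of years separating the excerpts. Larger values indicate that each sample is a poorer predictor of the other.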
