Zipf's Law and Avoidance of Excessive Synonymy

Zipf's law states that if the words of a language are ranked in order of decreasing frequency in texts, the frequency of a word is inversely proportional to its rank. The law is observed very reliably in data, but to date it has escaped a satisfactory theoretical explanation. This article suggests that Zipf's law may result from a hierarchical organization of word meanings over the semantic space, which in turn is generated by an evolution of word semantics dominated by the expansion of meanings and competition among synonyms. A study of the frequency of partial synonyms in Russian provides experimental evidence for the hypothesis that word frequency is determined by semantics.
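
For reference, the inverse proportionality stated above can be written as a formula. The symbols are introduced here for illustration only and are not taken from the article: f(r) is the frequency of the word at rank r, and C is a constant that depends on the corpus.

```latex
f(r) = \frac{C}{r}
\quad\Longleftrightarrow\quad
r \, f(r) = C
\quad\Longleftrightarrow\quad
\log f(r) = \log C - \log r
```

In this idealized form the product of rank and frequency stays constant, so a plot of log frequency against log rank is a straight line with slope -1.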
