Word Length Frequency and Distribution in English: Observations, Theory and Implications for the Construction of Verse Lines

Recent work in the theory of verse and in empirical metrics has suggested that constructing a verse line involves a pattern-matching search through a source text, and that the number of found elements (runs of complete words totaling a specified number of syllables) is given by dividing the total number of words by the mean number of syllables per word in the source text. This paper makes the latter point mathematically explicit. In the course of the demonstration it shows that word length frequency totals in English output are distributed geometrically (previous researchers reported an adjusted Poisson distribution), and that the sequential distribution of word lengths is random at the global level, with significant non-randomness in the fine structure. Data are reported from a corpus of just under two million words and a syllable-count lexicon of 71,000 word-forms. The pattern-matching theory is shown to be internally coherent, and some of the analytic techniques described here are observed to form a satisfactory test for regular (isometric) lineation in a text.
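The relation between the number of found elements and the mean word length can be illustrated with a small simulation. The sketch below is not the paper's procedure: the function names, the geometric parameter p = 0.7, and the 10-syllable target are assumptions chosen for illustration only. It draws word lengths independently from a geometric distribution on 1, 2, 3, ... syllables, counts the starting words from which a run of complete words totals exactly the target, and compares that count with the total number of words divided by the mean syllables per word.

```python
import random


def count_matches(syllables, target):
    """Count start positions from which consecutive whole words
    total exactly `target` syllables (hypothetical helper)."""
    matches = 0
    n = len(syllables)
    for start in range(n):
        total = 0
        i = start
        # Every word has at least one syllable, so the running total
        # reaches or passes the target within `target` words.
        while i < n and total < target:
            total += syllables[i]
            i += 1
        if total == target:
            matches += 1
    return matches


def geometric_length(rng, p):
    """Draw a word length k >= 1 with P(k) = p * (1 - p) ** (k - 1)."""
    k = 1
    while rng.random() > p:
        k += 1
    return k


def simulate(n_words=100_000, p=0.7, target=10, seed=1):
    """Return (found, predicted) for a randomly ordered synthetic text.

    p, target and seed are illustrative assumptions, not corpus values.
    """
    rng = random.Random(seed)
    syllables = [geometric_length(rng, p) for _ in range(n_words)]
    mean_length = sum(syllables) / n_words
    found = count_matches(syllables, target)
    predicted = n_words / mean_length
    return found, predicted


if __name__ == "__main__":
    found, predicted = simulate()
    print(f"found elements:    {found}")
    print(f"words / mean syl.: {predicted:.0f}")
```

Under these assumptions the two figures should agree closely, apart from sampling noise and a small shortfall at the end of the text where a run cannot be completed. A real text with non-random fine structure in its word order would be expected to depart from the prediction, which is the kind of deviation the abstract describes.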
