A convergent gambling estimate of the entropy of English

In his original paper on the subject, Shannon found upper and lower bounds for the entropy of printed English based on the number of trials required for a subject to guess subsequent symbols in a given text. The guessing approach precludes asymptotic consistency of either the upper or the lower bound except for degenerate ergodic processes. Shannon's technique of guessing the next symbol is altered here by having the subject place sequential bets on the next symbol of text. If S_{n} denotes the subject's capital after n bets at 27-for-1 odds, and if it is assumed that the subject knows the underlying probability distribution for the process X, then the entropy estimate is \hat{H}_{n}(X) = (1 - (1/n)\log_{27} S_{n}) \log_{2} 27 bits/symbol. If the subject does not know the true probability distribution for the stochastic process, then \hat{H}_{n}(X) is an asymptotic upper bound on the true entropy. If X is stationary, E\hat{H}_{n}(X) \rightarrow H(X), where H(X) is the true entropy of the process. Moreover, if X is ergodic, then by the Shannon-McMillan-Breiman theorem \hat{H}_{n}(X) \rightarrow H(X) with probability one. Preliminary indications are that English text has an entropy of approximately 1.3 bits/symbol, which agrees well with Shannon's estimate.
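
Below is a minimal sketch, in Python, of how the gambling estimate can be computed from a record of proportional bets over the 27-symbol alphabet (the 26 letters plus space) at 27-for-1 odds, as described above. The function name, the uniform toy bettor, and the sample text are illustrative assumptions, not part of the original experiment.

```python
import math

ALPHABET = "abcdefghijklmnopqrstuvwxyz "   # assumed 27-symbol alphabet: letters plus space
ODDS = len(ALPHABET)                        # 27-for-1 odds


def gambling_entropy_estimate(text, bets):
    """Gambling estimate of entropy in bits/symbol.

    text -- the realized symbols x_1, ..., x_n
    bets -- for each position, a dict mapping each symbol to the fraction
            of current capital wagered on it (fractions sum to 1)
    Returns (1 - (1/n) log_27 S_n) * log_2 27, where S_n is the capital
    after n bets starting from S_0 = 1.
    """
    log27_capital = 0.0                     # log_27 of the current capital S_t
    for x, bet in zip(text, bets):
        # At 27-for-1 odds, capital is multiplied by 27 * bet[x]
        log27_capital += math.log(ODDS * bet[x], ODDS)
    n = len(text)
    return (1.0 - log27_capital / n) * math.log2(ODDS)


# Toy check: a uniform bettor recovers log2(27), about 4.75 bits/symbol,
# the zeroth-order entropy of the 27-symbol alphabet.
uniform = {c: 1.0 / ODDS for c in ALPHABET}
sample = "the cat sat on the mat"
print(gambling_entropy_estimate(sample, [uniform] * len(sample)))
```

Since S_{n} = \prod_{t} 27\, b_{t}(x_{t}) under proportional betting, the estimate reduces to -(1/n) \sum_{t} \log_{2} b_{t}(x_{t}), the bettor's average log-loss, so sharper sequential predictions of the text drive the estimate down toward the true entropy.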

[1]  G. A. Miller,et al.  Statistical behavioristics and sequences of responses. , 1949, Psychological review.

[2]  J. Lotz Speech and Language , 1950 .

[3]  G. A. Miller,et al.  Verbal context and the recall of meaningful material. , 1950, The American journal of psychology.

[4]  E. B. Newman Computational methods useful in analyzing series of binary data. , 1951, The American journal of psychology.

[5]  E. B. Newman The pattern of vowels and consonants in various languages. , 1951, The American journal of psychology.

[6]  Claude E. Shannon,et al.  Prediction and Entropy of Printed English , 1951 .

[7]  G. A. Miller,et al.  A statistical description of operant conditioning. , 1951, The American journal of psychology.

[8]  E. B. Newman,et al.  A new method for analyzing printed English. , 1952, Journal of experimental psychology.

[9]  Benoit B. Mandelbrot,et al.  Simple games of strategy occurring in communication through natural languages , 1954, Trans. IRE Prof. Group Inf. Theory.

[10]  P. Fitts,et al.  The learning of sequential dependencies. , 1954, Journal of experimental psychology.

[11]  A. Chapanis The reconstruction of abbreviated printed messages. , 1954, Journal of experimental psychology.

[12]  G. A. Barnard,et al.  Statistical calculation of word entropies for four Western languages , 1955, IRE Trans. Inf. Theory.

[13]  G. A. Miller,et al.  Note on the bias of information estimates , 1955 .

[14]  J. Licklider,et al.  Long-range constraints in the statistical structure of printed English. , 1955, The American journal of psychology.

[15]  Richard C. Pinkerton Information theory and melody. , 1956 .

[16]  Victor H. Yngve Gap analysis and syntax , 1956, IRE Trans. Inf. Theory.

[17]  William F. Schreiber,et al.  The measurement of third order probability distributions of television signals , 1956, IRE Trans. Inf. Theory.

[18]  John L. Kelly,et al.  A new interpretation of information rate , 1956, IRE Trans. Inf. Theory.

[19]  B. S. Ramakrishna,et al.  Relative efficiency of English and German languages for communication of semantic content (Corresp.) , 1958, IRE Trans. Inf. Theory.

[20]  George A. Miller,et al.  Length-Frequency Statistics for Written English , 1958, Inf. Control..

[21]  J. Youngblood Style as Information , 1958 .

[22]  G. Basharin On a Statistical Estimate for the Entropy of a Sequence of Independent Random Variables , 1959 .

[23]  C. Blyth Note on Estimating Information , 1959 .

[24]  Edwin B. Newman,et al.  The Redundancy of Texts in Three Languages , 1960, Inf. Control..

[25]  L. Breiman Optimal Gambling Systems for Favorable Games , 1962 .

[26]  Charles P. Bourne,et al.  A Study of the Statistics of Letters in English Words , 1961, Inf. Control..

[27]  J. A. Hogan Copying redundant messages. , 1961, Journal of Experimental Psychology.

[28]  B. S. Ramakrishna,et al.  Relative Efficiencies of Indian Languages , 1961, Nature.

[29]  R. Shepard Production of constrained associates and the informational uncertainty of the constraint. , 1963, The American journal of psychology.

[30]  H. Bluhme Three-Dimensional Crossword Puzzles in Hebrew , 1963, Inf. Control..

[31]  Gift Siromoney,et al.  Entropy of Tamil Prose , 1963, Inf. Control..

[32]  Eugene S. Schwartz,et al.  A Dictionary for Minimum Redundancy Encoding , 1963, JACM.

[33]  E. Tulving Familiarity of letter-sequences and tachistoscopic identification. , 1963, The American journal of psychology.

[34]  Mario C. Grignetti,et al.  A Note on the Entropy of Words in Printed English , 1964, Inf. Control..

[35]  G. Siromoney,et al.  Style as Information in Karnatic Music , 1964 .

[36]  A. Treisman Verbal responses and contextual constraints in language , 1965 .

[37]  P. Tannenbaum,et al.  Word predictability in the environments of hesitations , 1965 .

[38]  K. R. Rajagopalan A Note on Entropy of Kannada Prose , 1965, Inf. Control..

[39]  J. R. Parks  Prediction and entropy of half-tone pictures. , 1965, Behavioral science.

[40]  William Paisley,et al.  The effects of authorship, topic, structure, and time of composition on letter redundancy in English texts , 1966 .

[41]  E. B. Coleman,et al.  A set of thirty-six prose passages calibrated for complexity , 1967 .

[42]  Michael Kassler,et al.  Character Recognition in Context , 1967, Inf. Control..

[43]  Eugene S. Schwartz,et al.  A Language Element for Compression Coding , 1967, Inf. Control..

[44]  H. E. White  Printed English compression by dictionary encoding , 1967 .

[45]  Josef Raviv,et al.  Decision making in Markov chains applied to the problem of pattern recognition , 1967, IEEE Trans. Inf. Theory.

[46]  Dean Jamison,et al.  A Note on the Entropy of Partially-Known Languages , 1968, Inf. Control..

[47]  J. Limb Entropy of quantised television signals , 1968 .

[48]  Ronald W. Cornew,et al.  A Statistical Method of Spelling Correction , 1968, Inf. Control..

[49]  Gift Siromoney,et al.  A Note on Entropy of Telugu Prose , 1968, Inf. Control..

[50]  Nicolaos S. Tzannes,et al.  On Estimating the Entropy of Random Fields , 1970, Inf. Control..

[51]  L. J. Savage Elicitation of Personal Probabilities and Expectations , 1971 .

[52]  D. Mcnicol The confusion of order in short-term memory , 1971 .

[53]  Vítězslav Maixner Some remarks on entropy prediction of natural language texts , 1971, Inf. Storage Retr..

[54]  J. Tuinman,et al.  The Effect of Reducing the Redundancy of Written Messages by Deletion of Function Words , 1972 .

[55]  K. Weltner The Measurement of Verbal Information in Psychology and Education , 1973 .

[56]  A. M. Zubkov Limit Distributions for a Statistical Estimate of the Entropy , 1974 .

[57]  B. Harris The Statistical Estimation of Entropy in the Non-Parametric Case , 1975 .

[58]  M. Wanas,et al.  First-, second- and third-order entropies of Arabic text (Corresp.) , 1976, IEEE Trans. Inf. Theory.

[59]  Richard Clark Pasco,et al.  Source coding algorithms for fast data compression , 1976 .

[60]  M.E. Hellman,et al.  Privacy and authentication: An introduction to cryptography , 1979, Proceedings of the IEEE.