The distribution of information content in English sentences

The sentence is a basic linguistic unit; however, little is known about how information content is distributed across the different positions of a sentence. Based on authentic English language data, the present study calculated entropy and other entropy-related statistics for different sentence positions. The statistics indicate a three-step, staircase-shaped distribution: entropy in the initial position is lower than in the medial positions (positions other than the initial and final), entropy in the medial positions is lower than in the final position, and the medial positions show no significant difference among themselves. The results suggest that: (1) the hypotheses of Constant Entropy Rate and Uniform Information Density do not hold for the sentence-medial positions; (2) the context of a word in a sentence should not be defined simply as all the words preceding it in the same sentence; and (3) the contextual information content in a sentence does not accumulate incrementally but follows a pattern of "the whole is greater than the sum of parts".
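To make the notion of position-wise entropy concrete, the following is a minimal sketch rather than the study's actual procedure. It assumes a plug-in (maximum-likelihood) estimate of Shannon entropy over the words observed at each position, restricted to sentences of one fixed length so that positions are comparable. The function name `positional_entropy`, the fixed-length restriction, and the choice of estimator are all illustrative assumptions; the abstract does not specify them.

```python
import math
from collections import Counter


def positional_entropy(sentences, length):
    """Plug-in Shannon entropy (in bits) of the word distribution at each
    position, using only sentences with exactly `length` tokens.

    Hypothetical helper for illustration; not the study's own method.
    """
    # One frequency table per sentence position.
    counts = [Counter() for _ in range(length)]
    for words in sentences:
        if len(words) != length:
            continue  # keep positions comparable across sentences
        for position, word in enumerate(words):
            counts[position][word.lower()] += 1

    entropies = []
    for counter in counts:
        total = sum(counter.values())
        if total == 0:
            entropies.append(0.0)  # no sentence of this length was seen
            continue
        # H = -sum p(w) * log2 p(w) over the words seen at this position.
        h = -sum((c / total) * math.log2(c / total) for c in counter.values())
        entropies.append(h)
    return entropies


if __name__ == "__main__":
    # Toy, made-up data; a real analysis would need a large corpus.
    toy = [
        ["the", "cat", "sat"],
        ["the", "dog", "ran"],
        ["the", "cat", "ran"],
        ["a", "dog", "sat"],
    ]
    for position, h in enumerate(positional_entropy(toy, 3)):
        print(f"position {position}: {h:.3f} bits")
```

On a real corpus the plug-in estimate is biased downward when the sample is small relative to the vocabulary, so comparisons across positions should use equal sample sizes or a bias-corrected estimator.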
