Information Density as a Factor for Variation in the Embedding of Relative Clauses

In German, relative clauses can be positioned in-situ or extraposed. One potential factor behind this variation is information density. This study tests that hypothesis on a corpus of 17th-century German funeral sermons. In a first calculation, the attention state was determined for each referent in the relative clauses and their matrix clauses. In a second calculation, a surprisal value was determined for each word, using a bigram language model. In a third calculation, the surprisal values were accommodated according to whether or not the word in question occurred for the first time. All three calculations pointed in the same direction: with in-situ relative clauses, the rate of new referents was lower and the average surprisal values, especially the accommodated ones, were lower than with extraposed relative clauses. This indicates that information density is a factor governing the choice between in-situ and extraposed relative clauses. The study also sheds light on the intrinsic relationship between the information-theoretic concept of information density and information-structural concepts such as givenness, which are used from a more linguistic perspective.
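To make the second calculation concrete, the following is a minimal sketch of how per-word surprisal can be derived from a bigram model, where the surprisal of a word is the negative base-2 logarithm of its probability given the preceding word. The toy corpus, the function names (`train_bigram`, `surprisal`, `mean_surprisal`), the sentence-start marker `<s>`, and the add-one smoothing are all illustrative assumptions; the abstract does not specify the study's actual smoothing method or tooling.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count contexts and bigrams over tokenized sentences (hypothetical helper)."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for sent in sentences:
        tokens = ["<s>"] + sent  # "<s>" marks the sentence start (assumption)
        vocab.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts, len(vocab)


def surprisal(prev, word, model):
    """Surprisal in bits: -log2 P(word | prev), with add-one smoothing (assumption)."""
    context_counts, bigram_counts, vocab_size = model
    prob = (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + vocab_size)
    return -math.log2(prob)


def mean_surprisal(sentence, model):
    """Average surprisal over the words of one tokenized clause."""
    tokens = ["<s>"] + sentence
    values = [surprisal(p, w, model) for p, w in zip(tokens, tokens[1:])]
    return sum(values) / len(values)


# Toy corpus for illustration only, not data from the study.
corpus = [["der", "mann"], ["der", "mann"], ["der", "hund"]]
model = train_bigram(corpus)

# A continuation seen more often after "der" is less surprising.
frequent = surprisal("der", "mann", model)
rare = surprisal("der", "hund", model)
clause_average = mean_surprisal(["der", "mann"], model)
```

Averaging such values over the words of a relative clause yields a per-clause figure of the kind that can then be compared between in-situ and extraposed clauses.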
