Corpus evidence and the role of probability estimates in processing decisions

An exception to a non-categorical generalization consists of a lexical item that exhibits the general pattern at a rate radically different from the norm, either far higher or far lower. Lexical differences in noun phrases containing non-subject relative clauses (NSRCs) correlate with large differences in the likelihood that the NSRC will begin with "that". In particular, the choices of determiner, head noun, and prenominal adjective in an NP containing an NSRC may dramatically raise or lower rates of "that" in the NSRC. These lexical variations can be partially explained in terms of predictability: more predictable NSRCs are less likely to begin with "that". This generalization can plausibly be explained in terms of processing, on the assumption that "that" facilitates processing, signals difficulty, or both. The correlations between lexical choices in the NP and the predictability of an NSRC can, in turn, be explained in terms of the semantics of the lexical items and the pragmatics of reference.
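
The predictability claim relates two conditional rates for a given lexical item: how often an NP containing that item also contains an NSRC, and how often such an NSRC begins with "that". The following is a minimal sketch of how those two quantities could be computed from corpus counts. The item names, the count fields, and all numbers are invented for illustration; they are assumptions, not figures from the study, and the study's own estimation method may differ.

```python
from collections import namedtuple

# Hypothetical count record for one lexical item (e.g. a determiner or head noun).
# Field names are assumptions made for this sketch.
Counts = namedtuple("Counts", "np_total np_with_nsrc nsrc_with_that")


def nsrc_predictability(c):
    """Estimate P(NSRC | item): share of NPs with this item that contain an NSRC."""
    return c.np_with_nsrc / c.np_total


def that_rate(c):
    """Estimate P("that" | NSRC, item): share of those NSRCs that begin with "that"."""
    return c.nsrc_with_that / c.np_with_nsrc


# Made-up counts, chosen only to illustrate the direction of the claimed pattern.
toy_counts = {
    "item_a": Counts(np_total=1000, np_with_nsrc=400, nsrc_with_that=80),
    "item_b": Counts(np_total=1000, np_with_nsrc=50, nsrc_with_that=35),
}

# List items from most to least predictive of an NSRC.
rows = sorted(
    ((name, nsrc_predictability(c), that_rate(c)) for name, c in toy_counts.items()),
    key=lambda row: row[1],
    reverse=True,
)

for name, predictability, rate in rows:
    print(f"{name}: P(NSRC|item)={predictability:.2f}  P(that|NSRC,item)={rate:.2f}")
```

In this toy setup the item that more strongly predicts an NSRC has the lower rate of "that", which is the direction of the correlation described above; real data would require a proper statistical model rather than a comparison of two raw ratios.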
