Combining Syntax and Thematic Fit in a Probabilistic Model of Sentence Processing

We present a model of human sentence processing that extends a standard probabilistic grammar model with a semantic module which computes the thematic fit of verbs and arguments in a cognitively plausible way. Our model differs from existing probabilistic accounts (e.g., Jurafsky, 1996) by capturing both syntactic and semantic influences in human sentence processing. It also overcomes limitations of constraint-based models (Spivey-Knowlton, 1996; Narayanan and Jurafsky, 2002), as its parameters can be acquired automatically from corpus data, and no hand-coding of constraints is required. We evaluate our semantic module against human ratings of thematic fit, and also test the complete model’s performance for two well-studied ambiguities from the sentence processing literature.

[1]  Mark Johnson,et al.  Robust probabilistic predictive syntactic processing: motivations, models, and applications , 2001 .

[2]  Andreas Stolcke,et al.  An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities , 1994, CL.

[3]  Susan M. Garnsey,et al.  The Contributions of Verb Bias and Plausibility to the Comprehension of Temporarily Ambiguous Sentences , 1997 .

[4]  Daniel Jurafsky,et al.  A Bayesian Model Predicts Human Parse Preference and Reading Times in Sentence Processing , 2001, NIPS.

[5]  Michael K. Tanenhaus,et al.  Modeling Thematic and Discourse Context Effects with a Multiple Constraints Approach: Implications for the Architecture of the Language Comprehension System , 1999 .

[6]  Matthew W. Crocker,et al.  Modular architectures and statistical mechanisms: The case from lexical category disambiguation , 2002 .

[7]  Brian Roark,et al.  Robust Probabilistic Predictive Syntactic Processing , 2001, ArXiv.

[8]  Susan M. Garnsey,et al.  Semantic Influences On Parsing: Use of Thematic Role Information in Syntactic Ambiguity Resolution , 1994 .

[9]  Beatrice Santorini,et al.  Building a Large Annotated Corpus of English: The Penn Treebank , 1993, CL.

[10]  Daniel Jurafsky,et al.  A Probabilistic Model of Lexical and Syntactic Access and Disambiguation , 1996, Cogn. Sci.

[11]  M. Tanenhaus,et al.  Modeling the Influence of Thematic Fit (and Other Constraints) in On-line Sentence Comprehension , 1998 .

[12]  Matthew W. Crocker,et al.  Ambiguity Resolution in Sentence Processing: Evidence against Frequency-Based Accounts , 2000 .

[13]  Frank Keller,et al.  Modelling Semantic Role Plausibility in Human Sentence Processing , 2006, EACL.

[14]  M. Brysbaert,et al.  Modifier Attachment in Sentence Parsing: Evidence from Dutch , 1996 .

[15]  Charles J. Fillmore,et al.  The Structure of the Framenet Database , 2003 .

[16]  M W Crocker,et al.  Wide-Coverage Probabilistic Sentence Processing , 2000, Journal of psycholinguistic research.

[17]  John Hale,et al.  A Probabilistic Earley Parser as a Psycholinguistic Model , 2001, NAACL.

[18]  Daniel Jurafsky,et al.  How Verb Subcategorization Frequencies Are Affected By Corpus Choice , 1998, COLING.

[19]  Daniel Gildea,et al.  The Proposition Bank: An Annotated Corpus of Semantic Roles , 2005, CL.

[20]  George A. Miller,et al.  Introduction to WordNet: An On-line Lexical Database , 1990 .

[21]  J. Trueswell.  The Role of Lexical Frequency in Syntactic Ambiguity Resolution , 1996 .