The time course of information integration in sentence processing

Recent work in sentence processing has highlighted the distinction between serial and parallel application of linguistic constraints in real time. In examining context effects in syntactic ambiguity resolution, some studies have reported an immediate influence of semantic and discourse information on syntactic parsing (e.g., McRae, Spivey-Knowlton, & Tanenhaus, 1998; Spivey & Tanenhaus, 1998). In examining the effects of various constraints on grammaticality judgments, however, other studies have reported a temporal precedence of structural information over semantic information (e.g., McElree & Griffith, 1995, 1998). This chapter points to some computational demonstrations of how an apparent temporal dissociation between structural and non-structural information can in fact arise from the dynamics of the processing system, coupled with the specific parameters of the individual stimuli, rather than from its architecture. A prediction of parallel competitive processing systems is then empirically tested with a new methodology: speeded sentence completions. Results are consistent with a parallel account of the application of linguistic constraints and a competitive account of ambiguity resolution.

Introduction

For more than two decades now, many psycholinguists have invested a great deal of effort in elucidating the "sequence of stages" involved in the comprehension of language. Emphasis has been placed on the question: When do different information sources (syntax, semantics, etc.) get extracted from the linguistic input? One answer to this question that has been very influential is that the computation of syntax precedes the computation of semantics and pragmatics (e.g., Frazier & Fodor, 1978; Ferreira & Clifton, 1986; McElree & Griffith, 1995, 1998). One opposing answer, which is gaining support, is that there are no architecturally imposed delays of information during sentence processing: all relevant information sources are extracted and used the moment they are received as input (MacDonald, Pearlmutter, & Seidenberg, 1994; Spivey-Knowlton & Sedivy, 1995; Trueswell & Tanenhaus, 1994). Recently, however, some disillusionment has been expressed concerning the question itself: "Given the wide range of results that have been reported, it seems most appropriate at the moment to determine the situations in which context does and does not have an influence on parsing, rather than continue the debate of when context has its impact" (Clifton, Frazier, & Rayner, 1994, p. 10, italics theirs). Perhaps one way to redirect the "when" question, and to better understand the mixed results in the literature, would be to turn it into a "how" question. Could the manner in which various information sources combine during sentence processing explain why context sometimes has an early influence and sometimes a late one? It seems clear that a treatment of this kind of question will require theoretical constructs and experimental methodologies that are new to sentence processing, as well as careful attention to lexically specific variation in stimulus items. The purpose of this chapter is to describe some of these new approaches and the implications they have for claims about the time course of information integration in sentence processing.
Nonlinear Dynamics

Over the past fifteen years, a number of researchers have designed dynamical models of sentence processing (Cottrell & Small, 1983; Elman, 1991; McClelland & Kawamoto, 1986; McRae, Spivey-Knowlton, & Tanenhaus, 1998; Selman & Hirst, 1985; Spivey & Tanenhaus, 1998; St. John & McClelland, 1990; Tabor & Hutchins, 2000; Tabor, Juliano, & Tanenhaus, 1997; Waltz & Pollack, 1985; Wiles & Elman, 1995; see also Henderson, 1994, and Stevenson, 1993, for hybrid models that combine rule-based systems with some fine-grained temporal dynamics). A dynamical model is a formal model that is described in terms of how it changes. Typically, such models take the form of a differential equation,

dx/dt = f(x)    (Eq. 1)

with an initial condition x = x_0. Here x is a vector of several dimensions and t is time. The equation says that the change in x can be computed from the current value of x. The behavior of such systems is often organized around attractors: stable states (where f(x) = 0) that the system moves toward from nearby positions. Nearby attractors tend to exert a strong "gravitational pull", while more distant attractors exert a weaker pull. The most common strategy is to assume that initial conditions are determined by the current context (e.g., a string of words like "Alison ran the coffee-grinder") and that attractors correspond to interpretations of that context (e.g., Alison is the agent of a machine-operation event in which the machine is a coffee-grinder). The model in Equation (1) is called nonlinear if f is a nonlinear function. Nonlinearity is a necessary consequence of having more than one attractor. Since languages contain many sentences with different interpretations (and many partial sentences with different partial interpretations), dynamical models of sentence processing are usually highly nonlinear. The potential for feedback in Equation (1), whereby the current value of a particular dimension of x can depend on its own past value, is also important: it can cause the system to vacillate in a complex manner before settling into an attractor.

Many dynamical sentence processing models are implemented as connectionist models (i.e., artificial neural networks). The "neural" activation values correspond to the dimensions of the vector x, and the activation update rules correspond (implicitly) to the function f. In some such cases (e.g., Elman, 1991; St. John & McClelland, 1990; Wiles & Elman, 1995), Equation (1) is replaced by an iterated mapping,

x_{t+1} = f(x_t)    (Eq. 2)

which makes large discrete changes in the activation values, rather than continuous or approximately continuous ones. Typically, such discrete models are designed so that words are presented to the model one at a time and activation flows in a feedforward manner upon presentation of each word. This architecture makes no use of the feedback potential of Equation (1), so the dynamics of single word presentations are trivial; but over the course of several word presentations, activation can flow in circuits around the network, and feedback (as well as input) can contribute significantly to the complexity of the trajectories (Wiles & Elman, 1995).
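To make the contrast between Equations (1) and (2) concrete, the sketch below simulates a one-dimensional system with two attractors, first by Euler-approximating the differential equation and then as an iterated map. This is a minimal illustration only; the functions f and g and all parameter values are assumptions chosen for clarity, not any published model of sentence processing.

```python
import math

# Illustrative sketch only: a one-dimensional nonlinear system with two
# attractors, standing in for the high-dimensional models in the text.

def f(x):
    # f(x) = x - x**3 is zero at x = -1, 0, +1; the states x = -1 and x = +1
    # are attractors, and x = 0 is an unstable equilibrium between them.
    return x - x**3

def settle_continuous(x0, dt=0.01, steps=1000):
    """Euler-approximate Eq. 1, dx/dt = f(x), from initial condition x0."""
    x = x0
    for _ in range(steps):
        x += dt * f(x)  # many small, approximately continuous changes
    return x

def g(x):
    # For the iterated map of Eq. 2, use g(x) = tanh(2x), whose stable fixed
    # points (where x = g(x)) lie near +/-0.96, with an unstable one at 0.
    return math.tanh(2.0 * x)

def settle_discrete(x0, steps=20):
    """Iterate Eq. 2, x_{t+1} = g(x_t): a few large, discrete changes."""
    x = x0
    for _ in range(steps):
        x = g(x)
    return x

# The initial condition (the "context") determines which attractor
# (which "interpretation") the system settles into.
print(settle_continuous(0.1), settle_continuous(-0.1))  # ~ +1.0, -1.0
print(settle_discrete(0.1), settle_discrete(-0.1))      # ~ +0.96, -0.96
```

Note that initial conditions near the unstable point between the two attractors take many more steps to settle than clear-cut ones; that kind of slowing is a property of the dynamics, not of the architecture.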
Other proposals allow feedback to cycle after every input presentation. Some such proposals present all the words in a sentence at once (Selman & Hirst, 1985), while others use serial word presentation and allow cycling after each word (Cottrell & Small, 1983; McRae et al., 1998; Spivey & Tanenhaus, 1998; Tabor & Hutchins, 2000; Tabor et al., 1997; Waltz & Pollack, 1985; Wiles & Elman, 1995). Models which allow feedback to cycle after each input make fine-grained predictions about the time course of information integration in sentence processing. In fact, several existing dynamical models of sentence processing exhibit at least simple forms of vacillation. For example, when presented with the string "Bob threw up dinner", Cottrell and Small's (1983) model shows a node corresponding to the "purposely propel" sense of "throw" first gaining and then losing activation (see also Kawamoto, 1993). Tabor et al. (1997) define a dynamical system in which isolated stable states correspond to partial parses of partial strings. At the word "the" in the partial sentence "A woman insisted the...", for example, they observe a trajectory which curves first toward and then away from an attractor corresponding to the (grammatically impossible) hypothesis that "the" is the determiner of a direct object of "insisted", before reaching a (grammatically appropriate) attractor corresponding to the hypothesis that "the" is the determiner of the subject of an embedded clause.

Syntax-first models of sentence processing (Frazier & Fodor, 1978; Frazier, 1987; McElree & Griffith, 1998) are typically designed to restrict vacillation to a very simple form: first one constraint system (syntax) chooses a parse instantaneously, and then another one (e.g., semantics) revises it if necessary. In lexical ambiguity resolution, there is evidence for another simple form of vacillation. Tanenhaus, Leiman, and Seidenberg (1979; see also Swinney, 1979, and Kawamoto, 1993) found that ambiguous words exhibit temporary (approximately 200 ms) priming of both meanings (e.g., "rose" as flower and "rose" as moved upward) even in a context where only one meaning is appropriate (e.g., "She held the rose"). Soon thereafter, the contextually inappropriate meaning ceases to exhibit priming.

Recent constraint-based models of parsing predict effects in syntactic ambiguity resolution that closely resemble these effects in lexical ambiguity resolution (MacDonald et al., 1994; Spivey & Tanenhaus, 1998; Trueswell & Tanenhaus, 1994). In contrast, typical syntax-first models of sentence processing posit syntactic parsing strategies that immediately select a single structural alternative (Frazier & Fodor, 1978; Frazier, 1987). To test these two types of models, we need experimental methodologies that provide access to the moment-by-moment representations computed during syntactic parsing. Do we see early vacillation between syntactic alternatives, as is seen between lexical alternatives? In this chapter, we will discuss two experimental methodologies that show promise for revealing the temporal dynamics of syntax-related information during sentence processing: speeded grammaticality judgments (McElree & Griffith, 1995, 1998) and speeded sentence completions. Results from these methodologies are simulated by a nonlinear competition algorithm called Normalized Recurrence (Filip, Tanenhaus, Carlson, Allopenna, & Blatt, this volume; McRae et al., 1998; Spivey & Tanenhaus, 1998; Tanenhaus, Spivey-Knowlton, & Hanna, 1999), sketched below.
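As a first pass, here is a minimal sketch of one cycle of Normalized Recurrence as it is usually stated (cf. McRae et al., 1998): normalize each constraint's support across the alternatives, integrate the weighted constraints into interpretation activations, and feed each activation back multiplicatively to the constraints. The constraint values, weights, criterion, and cycle limit below are illustrative assumptions, not fitted parameters from any of the studies cited.

```python
# Minimal sketch of the Normalized Recurrence competition cycle, assuming
# the normalize-integrate-feedback update described in McRae et al. (1998).

def normalized_recurrence(constraints, weights, criterion=0.95, max_cycles=200):
    """constraints: one list of supports per constraint, indexed by alternative.
    weights: one weight per constraint (assumed here to sum to 1).
    Returns (index of winning alternative, cycles taken)."""
    c = [list(row) for row in constraints]  # copy; updated in place below
    n_alts = len(c[0])
    for cycle in range(1, max_cycles + 1):
        # 1. Normalization: each constraint's support sums to 1.
        for row in c:
            total = sum(row)
            for a in range(n_alts):
                row[a] /= total
        # 2. Integration: each alternative's activation is the weighted sum
        #    of the (normalized) support from every constraint.
        act = [sum(w * row[a] for w, row in zip(weights, c))
               for a in range(n_alts)]
        # 3. Feedback: each constraint's support for an alternative grows in
        #    proportion to that alternative's activation
        #    (one common statement of the rule: c_ia += w_i * act_a * c_ia).
        for w, row in zip(weights, c):
            for a in range(n_alts):
                row[a] += w * act[a] * row[a]
        if max(act) >= criterion:
            break
    return act.index(max(act)), cycle

# Closely matched alternatives compete for many cycles before one reaches
# criterion; lopsided ones settle in a handful. The delay comes from the
# dynamics of competition, not from an architectural ordering of stages.
print(normalized_recurrence([[0.60, 0.40], [0.50, 0.50]], [0.5, 0.5]))
print(normalized_recurrence([[0.90, 0.10], [0.80, 0.20]], [0.5, 0.5]))
```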

[1] W. Tabor, C. Juliano, & M. K. Tanenhaus. Parsing in a Dynamical System: An Attractor-based Account of the Interaction of Lexical and Structural Constraints in Sentence Processing. 1997.

[2] A. H. Kawamoto. Nonlinear dynamics in the resolution of lexical ambiguity: A parallel distributed processing account. 1993.

[3] W. Tabor & S. Hutchins. Mapping the Syntax/Semantics Coastline. 2000.

[4] K. McRae, M. J. Spivey-Knowlton, & M. K. Tanenhaus. Modeling the Influence of Thematic Fit (and Other Constraints) in On-line Sentence Comprehension. 1998.

[5] M. F. St. John & J. L. McClelland. Learning and Applying Contextual Constraints in Sentence Comprehension. Artificial Intelligence, 1990.

[6] M. K. Tanenhaus, J. M. Leiman, & M. S. Seidenberg. Evidence for multiple stages in the processing of ambiguous words in syntactic contexts. 1979.

[7] B. McElree & T. Griffith. Syntactic and Thematic Processing in Sentence Comprehension: Evidence for a Temporal Dissociation. 1995.

[8] M. J. Spivey & M. K. Tanenhaus. Syntactic ambiguity resolution in discourse: Modeling the effects of referential context and lexical frequency. Journal of Experimental Psychology: Learning, Memory, and Cognition, 1998.

[9] M. K. Tanenhaus, M. J. Spivey-Knowlton, K. M. Eberhard, & J. C. Sedivy. Integration of visual and linguistic information in spoken language comprehension. Science, 1995.

[10] M. K. Tanenhaus, M. J. Spivey-Knowlton, & J. E. Hanna. Modeling Thematic and Discourse Context Effects with a Multiple Constraints Approach: Implications for the Architecture of the Language Comprehension System. 1999.

[11] C. Clifton, L. Frazier, & K. Rayner (Eds.). Perspectives on Sentence Processing. 1994.

[12] H. Filip, M. K. Tanenhaus, G. N. Carlson, P. D. Allopenna, & J. Blatt. Reduced relatives judged hard require constraint-based analyses. 2002.

[13] J. A. Fodor, T. G. Bever, & M. F. Garrett. The Psychology of Language. 1974.

[14] W. N. Francis & H. Kučera. Frequency Analysis of English Usage: Lexicon and Grammar. 1983.

[15] L. Frazier & J. D. Fodor. The sausage machine: A new two-stage parsing model. Cognition, 1978.

[16] S. Stevenson. Competition and recency in a hybrid network model of syntactic disambiguation. 1994.

[17] J. Henderson. Connectionist syntactic parsing using temporal variable binding. 1994.

[18] F. Ferreira & C. Clifton. The independence of syntactic processing. 1986.

[19] D. M. Green & J. A. Swets. Signal Detection Theory and Psychophysics. 1966.

[20] B. McElree & T. Griffith. Structural and lexical constraints on filling gaps during sentence comprehension: A time-course analysis. 1998.

[21] J. L. McClelland. On the time relations of mental processes: An examination of systems of processes in cascade. 1979.

[22] D. Swinney. Lexical access during sentence comprehension: (Re)consideration of context effects. 1979.

[23] P. D. Allopenna, J. S. Magnuson, & M. K. Tanenhaus. Tracking the Time Course of Spoken Word Recognition Using Eye Movements: Evidence for Continuous Mapping Models. 1998.

[24] G. Seth. Psychology of Language. Nature, 1968.

[25] J. Chandler. Subroutine STEPIT: Finds local minimum of a smooth function of several parameters. 1969.

[26] D. Heeger. Modeling simple-cell direction selectivity with normalized, half-squared, linear operators. Journal of Neurophysiology, 1993.

[27] D. L. Waltz & J. B. Pollack. Massively Parallel Parsing: A Strongly Interactive Model of Natural Language Interpretation. Cognitive Science, 1985.

[28] M. C. MacDonald, N. J. Pearlmutter, & M. S. Seidenberg. The lexical nature of syntactic ambiguity resolution. 1994.

[29] J. L. McClelland & J. L. Elman. The TRACE model of speech perception. Cognitive Psychology, 1986.

[30] J. L. McClelland & A. H. Kawamoto. Mechanisms of Sentence Processing: Assigning Roles to Constituents of Sentences. 1986.

[31] M. J. Spivey-Knowlton & J. C. Sedivy. Resolving attachment ambiguities with multiple constraints. Cognition, 1995.