Rapid adaptation in online pragmatic interpretation of contrastive prosody

Chigusa Kurumada, Meredith Brown, Sarah Bibyk, Daniel F. Pontillo, Michael K. Tanenhaus
{ckurumada,mbrown,sbibyk,dpontillo,mtan}@bcs.rochester.edu
Department of Brain and Cognitive Sciences, University of Rochester

Abstract

The realization of prosody varies across speakers, accents, and speech conditions. Listeners must navigate this variability to converge on consistent prosodic interpretations. We investigate whether listeners adapt to speaker-specific realization of prosody based on recent exposure and, if so, whether such adaptation is rapidly integrated with online pragmatic processing. We used the visual-world paradigm to investigate effects of prosodic cue reliability on the real-time interpretation of the construction “It looks like an X” pronounced either with (a) an H* pitch accent on the final noun, or (b) a contrastive L+H* pitch accent on looks and a rising boundary tone, a contour that can support a complex contrastive inference (e.g., It LOOKS like a zebra... (but it is not)). Eye movements suggest that listeners process the L+H* on looks as an early cue to a contrastive interpretation. This effect, however, diminished when listeners had been exposed to the same speaker using the L+H* accent infelicitously (e.g., Show me the blue square. Now, show me the BLUE circle). We argue that the process of prosodic interpretation is modulated by the reliability of prosodic cue values, enabling listeners to navigate variability in prosody across speakers and contexts.

Keywords: prosody, contrastive accent, pragmatic inference, eye-tracking, adaptation

Introduction

Successfully conveying an idea depends not only on what a speaker says, but also on how she says it. Prosody, the tonal and rhythmic realization of speech, allows communication of pragmatic meanings and emotions that interact with the lexical and syntactic contents of an utterance (e.g., “YOU shouldn’t say that” vs. “You shouldn’t say THAT”).

One long-standing issue in prosody research is how listeners map variable acoustic signals onto underlying prosodic representations. Prosodic features, such as pitch and duration, vary significantly across different speakers, populations, dialects, and contexts. For example, male voices generally have lower pitch than female voices, and speakers tend to use higher pitch when talking to a baby than to an adult. For listeners to map prosodic feature values onto more abstract prosodic representations (e.g., “high” tones), they must therefore take into account numerous situation-specific factors.

The lack of invariance between the acoustic signal and underlying linguistic representations is a more general problem in language comprehension. In studies on speech perception, it has been argued that listeners cope with this problem in two ways: by storing exemplars of speech signals (e.g., Goldinger, 1998; Pierrehumbert, 2001) and by tracking statistical information about phonetic cue values (e.g., voice onset time (VOT)) in the input. Recent studies have proposed that listeners can assess how reliably each cue predicts the underlying representations, and rapidly adapt their speech perception to more reliable cues in the input (Dell & Chang, 2013; Kleinschmidt & Jaeger, under review). For instance, listeners’ categorization functions for /p/ and /b/ can shift after experiencing VOT distributions with more or less variance (Clayards et al., 2008). Recently, attempts have been made to extend this logic to explain how listeners navigate syntactic variability to achieve robust and timely sentence processing (e.g., Fine et al., 2013; Kamide, 2012).

In the current study we evaluate the hypothesis that the human language comprehension system likewise deals with prosodic variability through sensitivity to statistics in the input. Specifically, we ask whether listeners adapt their real-time prosodic interpretations to the reliability of prosodic cue values assessed in recent exposure. To this end, we investigate English speakers’ interpretation of an intonation contour that is known to evoke a contrastive interpretation: the contrastive pitch accent (fall-rise, often annotated as L+H* in the ToBI convention (e.g., Silverman et al., 1992)) followed by a rising boundary tone (L-H%). This contour can signal a contrast between referents (e.g., We have pie L+H* L-H% [but no cake]; Ward & Hirschberg, 1985) or predicates (e.g., Lisa HAD L+H* the bell L-H% [but she no longer has one]; Dennison & Schafer, 2010).

This intonation contour has two properties that make it well-suited for investigating adaptation in prosody. First, online comprehension of the L+H* accent has been studied extensively, and it has been shown to trigger immediate eye movements to visually represented contrast items (e.g., Ito & Speer, 2008; Watson et al., 2008). For example, as soon as they hear L+H* on a color adjective (e.g., “Pick up a blue ball. Now, pick up a YELLOW L+H* ...”), listeners fixate color-contrasted items that belong to the same object category as the previous referent. We can examine how recent exposure modulates such rapid integration of the pitch accent. Second, while both the pitch accent and the boundary tone contribute to the contrastive meaning, their reliability may vary independently. In other words, some speakers may express a contrastive inference primarily through a pitch accent while others may rely more on a boundary tone. One way of navigating this variability would be to evaluate each prosodic representation independently and generalize the information selectively to the same type. In our study we test whether lowering the reliability of L+H* applies specifically to L+H* in future input, or whether it leads to a down-weighting of prosodic information in general.

In our previous work (Kurumada, Brown et al., 2012, 2013), we embedded the L+H* L-H% intonation contour in the English sentence It looks like an X. The L+H* accent was placed on the verb looks, followed by an utterance-final L-H% (Verb-focus prosody, Figure 1, right). We contrasted this with the same construction pronounced with a canonical ac-

[1] R Core Team, et al. R: A language and environment for statistical computing, 2014.

[2] Julia Hirschberg, et al. Implicating Uncertainty: The Pragmatics of Fall-Rise Intonation, 1985.

[3] Michael K. Tanenhaus, et al. Interpreting Pitch Accents in Online Comprehension: H* vs. L+H*, 2008, Cogn. Sci.

[4] Yuki Kamide. Learning individual talkers’ structural preferences, 2012, Cognition.

[5] Janet B. Pierrehumbert, et al. Exemplar dynamics: Word frequency, lenition and contrast, 2000.

[6] Mari Ostendorf, et al. TOBI: a standard for labeling English prosody, 1992, ICSLP.

[7] Ting Qian, et al. Rapid Expectation Adaptation during Syntactic Comprehension, 2013, PloS one.

[8] Gary S. Dell, et al. The P-chain: relating sentence production and its disorders to comprehension and acquisition, 2014, Philosophical Transactions of the Royal Society B: Biological Sciences.

[9] Julie C. Sedivy, et al. The effect of speaker-specific information on pragmatic inferences, 2011.

[10] Dave F. Kleinschmidt, et al. Robust speech perception: recognize the familiar, generalize to the similar, and adapt to the novel, 2015, Psychological review.

[11] David R. Cox. The analysis of binary data, 1970.

[12] R. Jacobs, et al. Perception of speech reflects optimal use of probabilistic speech cues, 2008, Cognition.

[13] S. Goldinger. Echoes of echoes? An episodic theory of lexical access, 1998, Psychological review.

[14] Julie C. Sedivy, et al. Subject Terms: Linguistics Language Eyes & eyesight Cognition & reasoning, 1995.

[15] D. Bates, et al. Linear Mixed-Effects Models using 'Eigen' and S4, 2015.

[16] Chigusa Kurumada, et al. Pragmatic interpretation of contrastive prosody: It looks like speech adaptation, 2012, CogSci.

[17] A. Schafer, et al. Online construction of implicature through contrastive prosody, 2010.

[18] Kiwako Ito, et al. Anticipatory effects of intonation: Eye movements during instructed visual search, 2008, Journal of memory and language.

[19] Chigusa Kurumada, et al. Incremental processing in the pragmatic interpretation of contrastive prosody, 2013, CogSci.