Speaker-specific processing and local context information: The case of speaking rate

To deal with variation in the speech signal, listeners rely on local context, such as the speaking rate of a carrier sentence directly preceding a target, as well as on more global properties of the speech signal, such as speaker-specific pronunciation variants. The present study addressed whether habitual speaking rate, despite its variability even within one speaker, can be tracked as a speaker-specific property, and how such speaker-specific tracking of habitual rate would interact with effects of local-rate normalization. In two experiments, listeners were exposed to a 2-min dialogue between a fast and a slow speaker. At test, listeners categorized minimal word-pair continua differing in the German /a/-/a:/ duration contrast, spoken by the same two speakers. The results showed that listeners responded with /a:/ more often for the fast speaker, but only when words were presented in isolation and not when they were presented with additional local-rate information. That is, despite the general assumption that duration cues and speaking rate are too variable to be used in a speaker-specific fashion, tracking habitual speaking rate may help speech perception. The results are discussed in relation to a belief-updating model of perceptual adaptation and to exemplar models.
