Electrophysiological Evidence for Early Interaction between Talker and Linguistic Information during Speech Perception

This study combined behavioral and electrophysiological measurements to investigate interactions during speech perception between native phonemes and a talker's voice. In a Garner selective attention task, participants classified each sound either as one of two native vowels ([ɛ] and [æ]), ignoring the talker, or as one of two male talkers, ignoring the vowel. The dimension to be ignored was held constant in baseline tasks and varied randomly across trials in filtering tasks. Irrelevant variation in talker produced as much filtering interference (i.e., poorer performance in filtering relative to baseline) in classifying vowels as irrelevant variation in vowel produced in classifying talkers, suggesting that the two dimensions strongly interact. Event-related potentials (ERPs) were recorded to identify the processing origin of the interference: an early disruption in extracting dimension-specific information or a later disruption in selecting appropriate responses. Processing in the filtering task was characterized by a sustained negativity starting 100 ms after stimulus onset and peaking 200 ms later. The early onset of this negativity suggests that interference originates in the cognitive effort required by listeners to extract dimension-specific information, a process that precedes response selection. In agreement with these findings, our results revealed numerous dimension-specific effects, most prominently in the filtering tasks.
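The filtering interference defined above (poorer performance in filtering relative to baseline) can be illustrated with a minimal Python sketch. It assumes that performance is indexed by mean reaction time, which the abstract does not specify; the function name `garner_interference` and all reaction times are hypothetical and invented for illustration, not data from the study.

```python
from statistics import mean


def garner_interference(baseline_rts, filtering_rts):
    """Return mean filtering RT minus mean baseline RT, in ms.

    A positive value means slower responses when the irrelevant
    dimension varied across trials (filtering) than when it was
    held constant (baseline).
    """
    return mean(filtering_rts) - mean(baseline_rts)


# Hypothetical reaction times in ms (illustrative only, not study data).
vowel_baseline = [520, 540, 530]    # classify vowel, talker held constant
vowel_filtering = [580, 600, 590]   # classify vowel, talker varies
talker_baseline = [510, 530, 520]   # classify talker, vowel held constant
talker_filtering = [570, 590, 580]  # classify talker, vowel varies

vowel_cost = garner_interference(vowel_baseline, vowel_filtering)
talker_cost = garner_interference(talker_baseline, talker_filtering)

print(f"Interference on vowel classification:  {vowel_cost} ms")
print(f"Interference on talker classification: {talker_cost} ms")
```

In this invented example the two costs are equal, which is the symmetric pattern the abstract describes: each dimension interferes with classification of the other to a similar degree.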

[78]  M. Hauser,et al.  The neuroethology of primate vocal communication: substrates for the evolution of speech , 1999, Trends in Cognitive Sciences.