Implicit Multisensory Associations Influence Voice Recognition

Natural objects provide partially redundant information to the brain through different sensory modalities. For example, voices and faces both give information about the speech content, age, and gender of a person. Thanks to this redundancy, multimodal recognition is fast, robust, and automatic. In unimodal perception, however, only part of the information about an object is available. Here, we addressed whether, even under conditions of unimodal sensory input, crossmodal neural circuits that have been shaped by previous associative learning become activated and underpin a performance benefit. We measured brain activity with functional magnetic resonance imaging before, during, and after participants learned to associate either sensory redundant stimuli, i.e., voices and faces, or arbitrary multimodal combinations, i.e., voices and written names, or ring tones and cell phones or the brand names of these cell phones. After learning, participants were better at recognizing unimodal auditory voices that had been paired with faces than those paired with written names, and association of voices with faces resulted in an increased functional coupling between voice and face areas. No such effects were observed for ring tones that had been paired with cell phones or names. These findings demonstrate that brief exposure to ecologically valid and sensory redundant stimulus pairs, such as voices and faces, induces specific multisensory associations. Consistent with predictive coding theories, associative representations thereafter become available for unimodal perception and facilitate object recognition. These data suggest that, for natural objects, effective predictive signals can be generated across sensory systems and proceed through optimization of functional connectivity between specialized cortical sensory modules.
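
The "increased functional coupling between voice and face areas" refers to a task-dependent change in the statistical relationship between the fMRI time series of two regions, an effect commonly tested with a psychophysiological interaction (PPI) style regression. The sketch below is a minimal, illustrative version of such an analysis on simulated data; the region labels, condition coding, coefficients, and coupling strength are assumptions chosen for illustration and are not the study's actual analysis pipeline.

```python
import numpy as np

# Minimal sketch of a PPI-style coupling analysis (illustrative only).
rng = np.random.default_rng(0)
n_scans = 200

# Time series extracted from a seed region (e.g., a voice-sensitive area).
seed = rng.standard_normal(n_scans)

# Psychological regressor: +1 for voices previously paired with faces,
# -1 for voices previously paired with written names, 0 otherwise.
condition = np.repeat([1, -1, 0, 1, -1], n_scans // 5).astype(float)

# PPI regressor: element-wise product of seed activity and condition.
ppi = seed * condition

# Simulated target time series (e.g., a face-sensitive area) with some
# condition-dependent coupling built in so the example has an effect to find.
target = 0.5 * seed + 0.4 * ppi + rng.standard_normal(n_scans)

# Ordinary least squares: does the interaction term explain target activity
# beyond the main effects of seed activity and condition?
X = np.column_stack([np.ones(n_scans), seed, condition, ppi])
beta, *_ = np.linalg.lstsq(X, target, rcond=None)
print(f"PPI coefficient (coupling change with face pairing): {beta[3]:.2f}")
```

A reliably positive interaction coefficient would indicate that the two regions' activity is more tightly coupled when the heard voice had previously been paired with a face, which is the kind of evidence summarized in the abstract above.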
