Facilitation of multisensory integration by the "unity effect" reveals that speech is special.

Whenever two or more sensory inputs are highly consistent along one or more dimensions, observers are more likely to perceive them as a single multisensory event rather than as separate unimodal events. For audiovisual speech, but not for other, noncommunicative, events, participants exhibit a "unity effect," whereby they are less sensitive to temporal asynchrony (i.e., they are more likely to bind the multisensory signals together) for matched than for mismatched speech events. This finding suggests that the modulation of multisensory integration by the unity effect in humans may be specific to speech. To test this hypothesis directly, we investigated whether the unity effect would also influence the multisensory integration of vocalizations from another primate species, the rhesus monkey. Human participants made temporal order judgments for matched and mismatched audiovisual stimuli presented at a range of stimulus-onset asynchronies (SOAs). The unity effect was examined with (1) a single call type produced by two different monkeys, (2) two different call types produced by the same monkey, (3) human versus monkey "cooing," and (4) speech sounds produced by a male and a female human. The results show that the unity effect influenced participants' performance only for the speech stimuli; no effect was observed for the monkey vocalizations or for the human imitations of monkey calls. These findings suggest that the facilitation of multisensory integration by the unity effect is specific to human speech signals.
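To make concrete how sensitivity to temporal asynchrony is typically quantified in such temporal order judgment tasks, the sketch below (Python, with illustrative numbers rather than the study's data) fits a cumulative Gaussian psychometric function to the proportion of "vision-first" responses across SOAs and derives a just noticeable difference (JND); under this convention, the unity effect corresponds to a larger JND (reduced sensitivity to asynchrony) for matched than for mismatched stimuli. The SOA values, response proportions, and function names are assumptions for illustration only.

```python
# Minimal sketch: estimating the point of subjective simultaneity (PSS) and the
# just noticeable difference (JND) from temporal order judgment (TOJ) data by
# fitting a cumulative Gaussian (probit-style) psychometric function.
# All numbers below are illustrative, not taken from the study.

import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

# Stimulus-onset asynchronies in ms (negative = auditory stream leads).
soas = np.array([-300, -200, -100, -50, 0, 50, 100, 200, 300], dtype=float)

# Hypothetical proportion of "vision-first" responses at each SOA.
p_vision_first = np.array([0.05, 0.10, 0.25, 0.40, 0.55, 0.70, 0.85, 0.95, 0.98])

def psychometric(soa, pss, sigma):
    """Cumulative Gaussian: probability of judging the visual stream as first."""
    return norm.cdf(soa, loc=pss, scale=sigma)

# Fit the 50% crossover (PSS) and the slope parameter (sigma).
(pss, sigma), _ = curve_fit(psychometric, soas, p_vision_first, p0=[0.0, 100.0])

# A common convention takes the JND as half the SOA range between the 25% and
# 75% points, which for a cumulative Gaussian is about 0.675 * sigma.
jnd = 0.675 * sigma
print(f"PSS = {pss:.1f} ms, JND = {jnd:.1f} ms")
```

Comparing the JNDs fitted separately to the matched and mismatched conditions (e.g., with a paired test across participants) is one straightforward way to test whether the unity effect is present for a given stimulus class.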
