Auditory Speech Processing Is Affected by Visual Speech in the Periphery

Two experiments were conducted to determine whether visual speech presented in the visual periphery affects the perceived identity of speech sounds. Auditory speech targets (vCv syllables) were presented in noise (SNR = -8 dB) with congruent or incongruent visual speech shown in full-face or upper-half-face conditions. Participants’ eye movements were monitored to ensure that visual speech input came only from the periphery. In Experiment 1, participants’ only task was to identify what they heard. Peripherally presented visual speech (full face) facilitated identification of AV-congruent stimuli relative to the upper-face control; conversely, it reduced correct identification of the incongruent stimuli. Experiment 2 was identical to the first except that participants also performed a concurrent central visual task. Once again, significant effects of visual speech were found. These results show that visual speech in the periphery affects auditory speech recognition.
