Bimodal Displays Improve Speech Comprehension in Environments with Multiple Speakers

Attending to a single voice when multiple voices are present is a common but challenging task. An experiment was conducted to determine (a) whether presenting a video display of the target speaker aided speech comprehension in an environment with competing voices, and (b) whether the “ventriloquism effect” reported by Driver (1996) could be used to enhance comprehension with ecologically valid stimuli. Participants listened for target words from videos of an actress reading while simultaneously ignoring the voices of 2 to 4 different actresses. Target-word detection declined as participants had to ignore more distracting voices; however, this decline was reduced when a video display of the target speaker was provided. Neither a signal-detection analysis of performance data nor a gaze-contingent analysis revealed a ventriloquism effect. Providing a video display of a speaker when competing voices are present improves comprehension, but the ventriloquism effect appears elusive in naturalistic circumstances. Actual or potential applications of this research include circumstances in which a listener must filter a relevant stream of speech from among multiple competing voices, such as air traffic control and military environments.

[1]  L. Rosenblum, et al. An audiovisual test of kinematic primitives for visual speech perception, 1996, Journal of Experimental Psychology: Human Perception and Performance.

[2]  H. McGurk, et al. Hearing lips and seeing voices, 1976, Nature.

[3]  Robert W. Proctor, et al. Human factors in simple and complex systems, 1993.

[4]  J. Blauert. Spatial Hearing: The Psychophysics of Human Sound Localization, 1983.

[5]  T. Wickens. Elementary Signal Detection Theory, 2001.

[6]  J. Driver. Enhancement of selective listening by illusory mislocation of speech sounds due to lip-reading, 1996, Nature.

[7]  W. R. Thurlow, et al. Effects of degree of visual association and angle of displacement on the "ventriloquism" effect, 1973, Perceptual and Motor Skills.

[8]  J. MacDonald, et al. Hearing by Eye: How Much Spatial Degradation can Be Tolerated?, 2000, Perception.

[9]  William A. Yost, et al. Spatial hearing: The psychophysics of human sound localization, revised edition, 1998.

[10]  T. Anderson, et al. Binaural and spatial hearing in real and virtual environments, 1997.

[11]  S. Handel. Listening: An Introduction to the Perception of Auditory Events, 1989.

[12]  P. Barber, et al. Effect of video frame rate on subjects' ability to shadow one of two competing verbal passages, 1994, Journal of Speech and Hearing Research.

[13]  A. Macleod, et al. A procedure for measuring auditory and audio-visual speech-reception thresholds for sentences in noise: rationale, evaluation, and recommendations for use, 1990, British Journal of Audiology.

[14]  E. Vatikiotis-Bateson, et al. Eye movement of perceivers during audiovisual speech perception, 1998, Perception & Psychophysics.

[15]  Paul Bertelson, et al. Ventriloquism, sensory interaction, and response bias: Remarks on the paper by Choe, Welch, Gilford, and Juola, 1976.

[16]  D. M. Green, et al. Sound localization by human listeners, 1991, Annual Review of Psychology.

[17]  D. Massaro. Perceiving talking faces: from speech perception to a behavioral principle, 1999.

[18]  W. R. Thurlow, et al. Certain Determinants of the “Ventriloquism Effect”, 1973, Perceptual and Motor Skills.

[19]  D. Massaro, et al. Perception of Visible Speech: Influence of Spatial Quantization, 1997, Perception.

[20]  Mark D. Lee, et al. Multichannel Auditory Search: Toward Understanding Control Processes in Polychotic Auditory Listening, 2001, Hum. Factors.