How Visual Cues for When to Listen Aid Selective Auditory Attention

Visual cues are known to aid auditory processing when they provide direct information about signal content, as in lip reading. However, some studies hint that visual cues also aid auditory perception by guiding attention to the target in a mixture of similar sounds. The current study directly tests this idea for complex, nonspeech auditory signals, using a visual cue providing only timing information about the target. Listeners were asked to identify a target zebra finch bird song played at a random time within a longer, competing masker. Two different maskers were used: noise and a chorus of competing bird songs. On half of all trials, a visual cue indicated the timing of the target within the masker. For the noise masker, the visual cue did not affect performance when target and masker were from the same location, but improved performance when target and masker were in different locations. In contrast, for the chorus masker, visual cues improved performance only when target and masker were perceived as coming from the same direction. These results suggest that simple visual cues for when to listen improve target identification by enhancing sounds near the threshold of audibility when the target is energetically masked and by enhancing segregation when it is difficult to direct selective attention to the target. Visual cues help little when target and masker already differ in attributes that enable listeners to engage selective auditory attention effectively, including differences in spectrotemporal structure and in perceived location.
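The design described above (two masker types, target and masker either co-located or spatially separated, a visual timing cue on half of all trials, and a target placed at a random time within a longer masker) can be illustrated with a minimal trial-list sketch. The function name `build_trials`, the condition labels, the durations, and the repetition count are all hypothetical placeholders chosen for illustration; they are not values reported by the study.

```python
import itertools
import random

# Condition labels are illustrative assumptions, not the study's own terms.
MASKERS = ("noise", "chorus")
SPATIAL = ("colocated", "separated")
VISUAL_CUE = (True, False)  # cue present on half of all trials


def build_trials(reps_per_cell, masker_dur_s=3.0, target_dur_s=1.0, seed=0):
    """Return a shuffled, fully crossed trial list.

    masker_dur_s and target_dur_s are hypothetical durations; the only
    constraint taken from the text is that the masker is longer than
    the target and the target starts at a random time inside it.
    """
    if target_dur_s >= masker_dur_s:
        raise ValueError("masker must be longer than the target")
    rng = random.Random(seed)
    latest_onset = masker_dur_s - target_dur_s
    trials = []
    for masker, spatial, cue in itertools.product(MASKERS, SPATIAL, VISUAL_CUE):
        for _ in range(reps_per_cell):
            trials.append({
                "masker": masker,
                "spatial": spatial,
                "visual_cue": cue,
                # Random onset so the whole target fits inside the masker.
                "target_onset_s": rng.uniform(0.0, latest_onset),
            })
    rng.shuffle(trials)
    return trials


trials = build_trials(reps_per_cell=10)
```

Crossing the three factors and repeating each cell equally is one simple way to guarantee that the cue appears on exactly half of the trials within every masker and spatial condition; the study itself may have balanced its trials differently.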
