Contributions of talker characteristics and spatial location to auditory streaming.

To examine whether auditory streaming contributes to unmasking, the intelligibility of target sentences against two competing talkers was measured using the coordinate response measure (CRM) corpus [Bolia et al., J. Acoust. Soc. Am. 107, 1065-1066 (2000)]. In the control condition, the speech reception threshold (50% correct) was measured with the target and both maskers collocated straight ahead. Separating the maskers from the target by +/-30 degrees produced a spatial release from masking of 12 dB. Each CRM sentence contains an identifier in its first part and two target words in its second part. In the experimental conditions, the masking talkers started spatially separated at +/-30 degrees but became collocated with the target before the scoring words. In one experiment, one target and two different maskers were randomly selected from a mixed-sex corpus. Significant unmasking of 4 dB remained despite the absence of persistent location cues. When the maskers and target were same-sex talkers, unmasking was reduced. These data suggest that, where target and masker voice characteristics differ substantially, initial spatial separation may permit confident identification and streaming of the target and masker speech. Where target and masker characteristics are similar, listeners must rely more heavily on continuing spatial cues.
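As a rough illustration of the arithmetic behind these figures, release from masking can be expressed as the drop in speech reception threshold relative to the collocated control. The sketch below assumes that definition; the function name and the absolute threshold values are hypothetical, chosen only so that the differences reproduce the 12 dB and 4 dB reported above.

```python
def release_from_masking_db(srt_collocated_db: float, srt_condition_db: float) -> float:
    """Return the threshold improvement (dB) of a condition over the collocated control."""
    return srt_collocated_db - srt_condition_db


# Hypothetical absolute thresholds (not reported in the abstract); only the
# differences between them correspond to the reported 12 dB and 4 dB.
SRT_COLLOCATED_DB = 0.0      # control: target and maskers straight ahead
SRT_SEPARATED_DB = -12.0     # maskers held at +/-30 degrees throughout
SRT_INITIAL_ONLY_DB = -4.0   # maskers separated at first, then collocated

full_release = release_from_masking_db(SRT_COLLOCATED_DB, SRT_SEPARATED_DB)
residual_release = release_from_masking_db(SRT_COLLOCATED_DB, SRT_INITIAL_ONLY_DB)

# Share of the full spatial release that survives without persistent location cues.
retained_fraction = residual_release / full_release

print(f"Full release: {full_release:.1f} dB")
print(f"Residual release: {residual_release:.1f} dB")
print(f"Retained fraction: {retained_fraction:.2f}")
```

Under these assumptions, about one third of the full spatial release is retained in the mixed-sex condition when the maskers are separated only at the start of the sentence.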
