SPEECH INTELLIGIBILITY, SPATIAL UNMASKING, AND REALISM IN REVERBERANT SPATIAL AUDITORY DISPLAYS

Many auditory displays strive to include accurate directional spatial cues, but few provide robust cues for source distance. This paper considers how including echoes and reverberation in a spatial auditory display (to create salient cues for source distance) affects other aspects of performance, especially speech intelligibility and spatial unmasking. Preliminary results from masked speech intelligibility studies, together with results from previous experiments investigating sound localization, suggest that including modest amounts of reverberation (such as that present in a typical, everyday room) can provide useful distance information without causing large performance degradations on other tasks.
