The Effect of Situation-Specific Non-Speech Acoustic Cues on the Intelligibility of Speech in Noise

In everyday life, speech is often accompanied by a situation-specific acoustic cue: a hungry bark, say, as you ask ‘Has anyone fed the dog?’. This paper investigates the effect such cues have on speech intelligibility in noise and evaluates their interaction with the established effect of situation-specific semantic cues. The work is motivated by the introduction of new object-based broadcast formats, which have the potential to optimise intelligibility by controlling the level of individual broadcast audio elements at the point of service. The results show that situation-specific acoustic cues alone can improve word recognition in multi-talker babble by 69.5%, a similar amount to semantic cues. The combination of semantic and acoustic cues provides a further improvement of 106.0% compared with no cues, and of 18.7% compared with semantic cues alone. Interestingly, whilst increasing the subjective intelligibility of the target word, the presence of acoustic cues degraded the objective intelligibility of the speech-based semantic cues by 47.0% (equivalent to reducing the speech level by 4.5 dB). The paper discusses the interactions between the two types of cue and the implications of these results for assessing and improving the intelligibility of broadcast speech.
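The motivation rests on the ability of object-based broadcasting to adjust individual audio elements at the point of service. The Python below is a minimal sketch of that idea, not the rendering pipeline used in this study: the object names, gain values, and placeholder signals are illustrative assumptions. It mixes broadcast objects after applying per-object gains in dB, e.g. boosting dialogue and attenuating babble while leaving a situation-specific cue (the bark) at its original level.

import numpy as np

def render_scene(objects, gains_db):
    """Mix broadcast audio objects after applying per-object gains (dB)."""
    mix = None
    for name, signal in objects.items():
        gain = 10.0 ** (gains_db.get(name, 0.0) / 20.0)  # dB -> linear amplitude
        scaled = gain * signal
        mix = scaled if mix is None else mix + scaled
    return mix

# Illustrative one-second scene at 48 kHz with placeholder noise signals.
fs = 48000
n = fs  # one second of samples
objects = {
    "dialogue": 0.1 * np.random.randn(n),
    "babble":   0.1 * np.random.randn(n),
    "bark":     0.1 * np.random.randn(n),
}
# Hypothetical per-object gains chosen at the point of service.
gains_db = {"dialogue": +3.0, "babble": -6.0, "bark": 0.0}
mix = render_scene(objects, gains_db)

In an object-based format these gains would be set per listener (for example, for a hearing-impaired viewer) rather than fixed at the broadcaster's mix, which is what makes the intelligibility trade-offs measured in this paper actionable.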
