Informational masking of speech by acoustically similar intelligible and unintelligible interferers.

The masking experienced when target speech is accompanied by a single interfering voice is often primarily informational masking (IM). IM is generally greater when the interferer is intelligible than when it is not (e.g., speech from an unfamiliar language), but the relative contributions of acoustic-phonetic and linguistic interference are often difficult to assess because the interferers being compared also differ acoustically (e.g., different talkers). In this study, three-formant analogues (F1+F2+F3) of natural sentences were used as targets and interferers. Targets were presented monaurally, either alone or accompanied in the contralateral ear by an interferer derived from a different sentence (F0 = 4 semitones higher); the target-to-masker ratio (TMR) between ears was 0, 6, or 12 dB. Interferers were either intelligible or rendered unintelligible by delaying F2 and advancing F3 by 150 ms relative to F1, a manipulation designed to minimize spectro-temporal differences between corresponding interferers. Target-sentence intelligibility (keywords correct) was 67% when the target was presented alone, but fell considerably when an unintelligible interferer was present (49%) and significantly further when the interferer was intelligible (41%). Changes in TMR produced neither a significant main effect nor an interaction with interferer type. Interference with acoustic-phonetic processing of the target can explain much of the impact on intelligibility, but linguistic factors, particularly interferer intrusions, also make an important contribution to IM.
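The two stimulus manipulations described above (the 150-ms formant shifts and the between-ear TMR) can be illustrated with a minimal Python sketch. It is not the authors' synthesis code. It assumes each formant is stored as a list of per-frame values at a fixed frame step, that vacated frames are filled with a placeholder value rather than wrapped around, and that the TMR is set by attenuating the interferer while the target level stays fixed. The abstract specifies none of these details, and all function names and the 10-ms frame step are hypothetical.

```python
def ms_to_frames(shift_ms, frame_ms):
    """Convert a time shift in ms to a whole number of frames."""
    return round(shift_ms / frame_ms)


def shift_frames(track, n_frames, fill=0.0):
    """Delay (n_frames > 0) or advance (n_frames < 0) a per-frame track.

    The track keeps its length; vacated frames take `fill`
    (an assumption, the source does not say how edges were handled).
    """
    n = len(track)
    if n_frames >= 0:
        k = min(n_frames, n)
        return [fill] * k + list(track[:n - k])
    k = min(-n_frames, n)
    return list(track[k:]) + [fill] * k


def desynchronize(f1, f2, f3, shift_ms=150.0, frame_ms=10.0):
    """Leave F1 untouched, delay F2 and advance F3 by shift_ms."""
    k = ms_to_frames(shift_ms, frame_ms)
    return list(f1), shift_frames(f2, k), shift_frames(f3, -k)


def interferer_gain(tmr_db):
    """Linear amplitude gain applied to the interferer for a given TMR,
    assuming the target level is held constant."""
    return 10 ** (-tmr_db / 20)


if __name__ == "__main__":
    # Toy tracks of 40 frames (400 ms at the assumed 10-ms step).
    f1 = [float(i) for i in range(40)]
    f2 = [float(i) for i in range(40)]
    f3 = [float(i) for i in range(40)]
    _, f2_late, f3_early = desynchronize(f1, f2, f3)
    print(f2_late[:17])   # 15 filled frames, then the start of F2
    print(f3_early[:3])   # F3 now starts at its original frame 15
    for tmr in (0, 6, 12):
        print(tmr, "dB ->", round(interferer_gain(tmr), 3))
```

Because each formant track is shifted as a whole rather than altered internally, every formant retains its own frequency and amplitude contour, which is consistent with the stated aim of minimizing spectro-temporal differences between the intelligible and unintelligible versions of an interferer.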
