Comparison of fluctuating maskers for speech recognition tests

Abstract

Objective: To investigate the extent to which temporal gaps, temporal fine structure, and comprehensibility of the masker affect masking strength in speech recognition experiments.

Design: Seven masker types were evaluated with Dutch speech materials. These included the ICRA-5 fluctuating noise, the international speech test signal (ISTS), and competing talkers in Dutch and Swedish.

Study Sample: Normal-hearing and hearing-impaired subjects.

Results: The normal-hearing subjects benefited from both temporal gaps and temporal fine structure in the fluctuating maskers. Performance decreased when the competing talker was comprehensible. The ISTS masker appeared to cause a large informational masking component. The stationary maskers yielded the steepest slopes of the psychometric function, followed by the modulated noises and then by the competing talkers. Although the hearing-impaired group was heterogeneous, their data showed similar tendencies, sometimes to a lesser extent, depending on the individual's hearing impairment.

Conclusions: If measurement time is of primary concern, non-modulated maskers are advised. If it is useful to assess release from masking due to temporal gaps, a fluctuating noise is advised. If perception of temporal fine structure is being investigated, a foreign-language competing talker is advised.
