Discovering consistent word confusions in noise

Listeners make mistakes when communicating under adverse conditions, with overall error rates reasonably well predicted by existing speech intelligibility metrics. However, a detailed examination of confusions made by a majority of listeners is more likely to provide insights into processes of normal word recognition. The current study measured the rate at which robust misperceptions occurred for highly confusable words embedded in noise. In a second experiment, confusions discovered in the first listening test were subjected to a range of manipulations designed to help identify their cause. These experiments reveal that while majority confusions are quite rare, they occur sufficiently often to make large-scale discovery worthwhile. Surprisingly few misperceptions were due solely to energetic masking by the noise, suggesting that speech and noise “react” in complex ways that are not well described by traditional masking concepts.
