A frequency-selective feedback model of auditory efferent suppression and its implications for the recognition of speech in noise.

The potential contribution of the peripheral auditory efferent system to the understanding of speech in competing background noise was studied using a computer model of the auditory periphery and assessed using an automatic speech recognition system. A previous study had shown that a fixed efferent attenuation applied to all channels of a multi-channel model could improve the recognition of connected digit triplets in noise [G. J. Brown, R. T. Ferry, and R. Meddis, J. Acoust. Soc. Am. 127, 943-954 (2010)]. In the current study, an anatomically justified feedback loop was used to automatically regulate a separate attenuation value for each auditory channel. This arrangement further improved speech recognition relative to the fixed-attenuation conditions. Comparisons between multi-talker babble and pink-noise interference conditions suggest that the benefit originates from the model's ability to adjust the amount of suppression in each channel separately, according to the spectral shape of the interfering sound.
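The idea of a per-channel feedback loop can be illustrated with a minimal sketch. This is not the model used in the study: the function name `efferent_feedback`, the leaky-integrator update rule, and every parameter value (`threshold_db`, `rate`, `decay`, `max_atten_db`) are assumptions chosen only to show how each channel could settle on its own attenuation from its own output level, in contrast to a single fixed attenuation shared by all channels.

```python
def efferent_feedback(levels_db, threshold_db=40.0, rate=0.1,
                      decay=0.9, max_atten_db=30.0):
    """Toy per-channel feedback attenuation (hypothetical, for illustration).

    levels_db: list of channels, each a list of per-frame input levels in dB.
    Returns a list of channels holding the attenuated per-frame levels in dB.
    """
    attenuated = []
    for channel in levels_db:
        atten = 0.0                      # each channel keeps its own attenuation state
        channel_out = []
        for level in channel:
            out = level - atten          # apply the attenuation currently in force
            channel_out.append(out)
            # Feedback: output above threshold drives the attenuation up,
            # while the decay term lets it relax when the channel is quiet.
            excess = max(0.0, out - threshold_db)
            atten = min(max_atten_db, decay * atten + rate * excess)
        attenuated.append(channel_out)
    return attenuated


# Two channels: one quiet (30 dB), one dominated by steady noise (70 dB).
quiet = [30.0] * 50
noisy = [70.0] * 50
result = efferent_feedback([quiet, noisy])
print(result[0][-1], round(result[1][-1], 1))
```

In this toy setup the quiet channel is left untouched while the noisy channel is progressively attenuated, so the amount of suppression follows the spectral distribution of the interference rather than being uniform across channels.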
