The acoustic features of speech sounds in a model of auditory processing: vowels and voiceless fricatives

The acoustic features of three classes of complex sounds (complex tones, vowels and voiceless fricatives) were analyzed using a model of auditory signal processing. The model consists of a peripheral cochlear component followed by two central neural networks. At the peripheral stage, the asymmetrical shape of the cochlear filters, in combination with the preservation of the fine temporal structure of their outputs, provides a robust spatio-temporal representation of speech sounds. The cochlear patterns are subsequently processed by two separate layers of lateral inhibitory networks (LINs) in order to extract perceptually significant features of the input signal. For speech-like signals, the LIN output emphasizes the spectral components in the region of the formant peaks. The LIN patterns generated in response to vowels spoken by male and female speakers vary somewhat, particularly in the location of the formant peaks. However, the relative amplitudes of the LIN peaks (or, more precisely, the weight distribution of the LIN patterns) provide a more stable representation of each of the major vocalic classes. For the voiceless fricatives, the model suggests that the most distinctive acoustic feature is the location of the high-frequency edge of the signal spectrum.
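
The role of a lateral inhibitory stage can be illustrated with a minimal sketch. It is not the model described above: the abstract does not specify the LIN equations, so the nearest-neighbour subtraction, the half-wave rectification, the `strength` parameter, the function names and the toy `profile` values are all assumptions chosen for illustration. The sketch only shows how inhibition across channels can emphasize spectral peaks, and how a normalized weight distribution can be read off the result.

```python
def lateral_inhibition(pattern, strength=1.0):
    """Toy lateral inhibition across frequency channels (assumed form).

    Each channel is reduced by the mean of its two neighbours (edge
    channels reuse their own value for the missing neighbour), and
    negative results are clipped to zero.
    """
    n = len(pattern)
    out = []
    for i, centre in enumerate(pattern):
        left = pattern[max(i - 1, 0)]
        right = pattern[min(i + 1, n - 1)]
        out.append(max(0.0, centre - strength * 0.5 * (left + right)))
    return out


def weight_distribution(pattern):
    """Normalize a pattern so its values sum to 1 (all zeros if empty of energy)."""
    total = sum(pattern)
    if total <= 0:
        return [0.0] * len(pattern)
    return [v / total for v in pattern]


# Hypothetical channel profile with two formant-like peaks (channels 3 and 7).
profile = [1.0, 1.0, 2.0, 4.0, 2.0, 1.0, 1.0, 3.0, 1.0, 1.0]
sharpened = lateral_inhibition(profile)
weights = weight_distribution(sharpened)
print(sharpened)
print(weights)
```

In this toy profile only the two peak channels survive the inhibition, so the normalized weights describe how the output is shared between the peaks rather than their absolute levels.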
