Cochlear Linear Predictive Coding: Interfacing an Analogue Cochlear Model with a Conventional Speech Recognition System

Conventional speech recognition systems require as input a sequence of acoustic vectors that encode the relevant features of the speech signal. The biological cochlea (inner ear), on the other hand, processes the speech signal to extract a neural coding suited to the higher levels of the brain. The underlying idea of this article is to interface an analogue electronic model of the cochlea with a statistical classifier based on hidden Markov models (HMMs). Combined with an analogue gradient descent circuit, such an artificial cochlea can be used to extract the Linear Predictive Coding (LPC) of the speech signal in continuous time. We thus propose an analogue VLSI circuit implementing a so-called 'Cochlear LPC' (CLPC). We present speech recognition results obtained using a computer model of the CLPC circuit combined with an HMM classifier; these compare favourably with those obtained from a standard LPC/HMM system. We discuss the advantages of the CLPC, such as its optimal time-frequency resolution and dynamics properties, as well as those of continuous-time processing in general. Finally, we briefly outline alternative pre-processing options based on the artificial cochlea, which are expected to increase its functional compatibility with a statistical classifier.
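
To make the idea of extracting LPC coefficients by gradient descent concrete, the following is a minimal discrete-time sketch, not the analogue CLPC circuit itself. It uses the standard LMS update on the squared prediction error as a digital stand-in for a continuous-time gradient descent. The function name `lms_lpc`, the predictor order, the step size `mu` and the sinusoidal test signal are all assumptions chosen for illustration.

```python
import math

def lms_lpc(signal, order=2, mu=0.05):
    """Estimate linear prediction coefficients by LMS gradient descent.

    Hypothetical helper for illustration: a digital stand-in for an
    analogue gradient descent, not the circuit described in the article.
    """
    a = [0.0] * order  # prediction coefficients, initialised to zero
    for n in range(order, len(signal)):
        # The 'order' most recent past samples: x[n-1], ..., x[n-order]
        past = [signal[n - k] for k in range(1, order + 1)]
        # Linear prediction of the current sample from the past samples
        pred = sum(ak * xk for ak, xk in zip(a, past))
        err = signal[n] - pred
        # Step each coefficient down the gradient of the squared error
        a = [ak + mu * err * xk for ak, xk in zip(a, past)]
    return a

# Test signal (an assumption): a pure sinusoid, which satisfies
# x[n] = 2*cos(w)*x[n-1] - x[n-2] exactly.
w = 0.5
x = [math.sin(w * n) for n in range(20000)]
coeffs = lms_lpc(x, order=2, mu=0.05)
print(coeffs)  # should approach [2*cos(w), -1]
```

For this noise-free sinusoid the prediction error can be driven to zero, so the coefficients settle at the values given by the recurrence. On real speech the coefficients would instead track the slowly varying spectral envelope, and the choice of step size trades tracking speed against fluctuation.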
