An analog VLSI architecture for auditory based feature extraction

We have developed a low-power analog VLSI chip for real-time signal processing, motivated by the principles of the human auditory system. An on-chip analog cochlear filter bank decomposes the input audio signal into several frequency bands of nearly equal bandwidth on a logarithmic scale, a step similar to computing a wavelet transform. The chip then computes the signal energy and the zero-crossing time intervals of the frequency components at the outputs of the cochlear filter bank. The chip is intended to serve as the front end of a speech recognition system. We include experimental results from a VLSI implementation of the auditory front end. We also present speech recognition results on the TI-DIGITS database, obtained from computer simulations that model the functionality of the feature extraction VLSI hardware. The recognizer uses hidden Markov models (HMMs) in combination with linear discriminant analysis (LDA).
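As a rough software illustration of the processing chain described above (log-spaced band decomposition, then per-band energy and zero-crossing intervals), the following Python sketch may help. It is not the chip's circuit or the authors' simulation: the Gaussian log-frequency filters, the function names (`log_spaced_centers`, `filter_bank`, `band_features`), and all parameter values are assumptions chosen only to keep the example short.

```python
import numpy as np

def log_spaced_centers(f_lo, f_hi, n_bands):
    """Center frequencies equally spaced on a log scale (hypothetical helper)."""
    return np.geomspace(f_lo, f_hi, n_bands)

def filter_bank(x, fs, centers, octaves=0.5):
    """Split x into bands of equal width on a log-frequency axis.

    Assumption: each band is a Gaussian window in log2(frequency), applied
    in the FFT domain. This stands in for the analog cochlear filters.
    """
    n = len(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum = np.fft.rfft(x)
    log_f = np.log2(np.maximum(freqs, 1e-9))  # avoid log2(0) at DC
    sigma = octaves / 2.0
    bands = []
    for fc in centers:
        gain = np.exp(-0.5 * ((log_f - np.log2(fc)) / sigma) ** 2)
        gain[0] = 0.0  # remove DC explicitly
        bands.append(np.fft.irfft(spectrum * gain, n=n))
    return np.array(bands)

def band_features(band, fs):
    """Return (mean-square energy, intervals in seconds between upward zero crossings)."""
    energy = float(np.mean(band ** 2))
    negative = np.signbit(band)
    # An upward crossing occurs where a negative sample is followed by a non-negative one.
    upward = np.flatnonzero(negative[:-1] & ~negative[1:])
    intervals = np.diff(upward) / fs
    return energy, intervals

if __name__ == "__main__":
    fs = 16000
    t = np.arange(fs) / fs
    tone = np.sin(2 * np.pi * 1000 * t + 0.3)  # 1 kHz test tone
    centers = log_spaced_centers(125, 4000, 6)
    for fc, band in zip(centers, filter_bank(tone, fs, centers)):
        energy, intervals = band_features(band, fs)
        print(f"{fc:7.1f} Hz  energy={energy:.3e}  crossings={len(intervals) + 1}")
```

For a pure tone, most of the energy should land in the band whose center is nearest the tone frequency, and the zero-crossing intervals in that band should approximate the tone period.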
