Throat Microphone Signals for Isolated Word Recognition Using LPC

Automatic speech recognition (ASR) is a technology that allows a computer to identify the words a person speaks into a microphone or telephone. One of the most difficult problems for an ASR system is dealing with noise. The performance of standard ASR systems that use a normal microphone (NM) degrades even when the environment is only slightly noisy. In this system, throat microphone (TM) signals are used for isolated word recognition. In contrast to NM speech, TM speech is unaffected by such ambient noise. We use the linear predictive coding (LPC) spectral analysis model for speech recognition. The ASR system is designed to recognize isolated Devanagari words.
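
As an illustration of the LPC analysis step mentioned above, the following is a minimal sketch of the autocorrelation method with the Levinson-Durbin recursion applied to a single speech frame. The abstract does not specify the front-end details, so the function names, the frame length, and the predictor order used here are assumptions chosen for illustration only; they are not the authors' implementation.

```python
def autocorrelation(frame, max_lag):
    """Short-time autocorrelation r[0..max_lag] of one frame of samples."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc_coefficients(frame, order):
    """Levinson-Durbin recursion (hypothetical helper, not from the paper).

    Returns (a, err) where a[0] == 1.0 and the predictor is
    s[n] ~= -(a[1]*s[n-1] + ... + a[order]*s[n-order]);
    err is the final prediction error energy.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    if err == 0.0:
        # Silent frame: nothing to predict.
        return a, 0.0
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        # Update coefficients a[1..i] from the previous stage.
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a, err


# Example: a decaying exponential behaves like a first-order
# all-pole signal, so a[1] should be close to -0.9.
demo_frame = [0.9 ** n for n in range(200)]  # assumed 200-sample frame
coeffs, residual_energy = lpc_coefficients(demo_frame, order=2)
```

In a recognizer of this kind, the coefficients computed per frame (or features derived from them) would form the feature vectors compared against stored word templates; how that comparison is done is not stated in the abstract.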
