English Speech Recognition System on Chip

Abstract: An English speech recognition system was implemented on a single chip, referred to as a speech system-on-chip (SoC). The SoC includes an application-specific integrated circuit with a vector accelerator to improve performance. The recognition algorithm, which uses sub-word models based on continuous-density hidden Markov models, runs on a very low-cost speech chip. The algorithm extends a two-stage fixed-width beam-search baseline system with a variable beam-width pruning strategy and a frame-synchronous word-level pruning strategy to significantly reduce the recognition time. Tests show that, compared with the original system, this method reduces the recognition time nearly 6-fold and the memory size nearly 2-fold, with less than 1% accuracy degradation on a 600-word recognition task, for which the recognition accuracy is about 98%.
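
The pruning idea named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the actual pruning rule, so everything below is an assumption made for illustration: the function names (`prune`, `variable_beam`, `decode_step`), the parameters `base_beam` and `max_active`, and the rule that narrows the beam in proportion to the number of active hypotheses are all hypothetical and are not taken from the paper. Scores are treated as log-likelihoods, so higher is better.

```python
def prune(scores, beam):
    """Keep hypotheses whose log score lies within `beam` of the best one."""
    best = max(scores.values())
    return {hyp: s for hyp, s in scores.items() if s >= best - beam}


def variable_beam(num_active, base_beam, max_active):
    """Hypothetical rule: narrow the beam when too many hypotheses are active."""
    if num_active <= max_active:
        return base_beam
    # Assumption: shrink the beam in proportion to the overflow.
    return base_beam * max_active / num_active


def decode_step(scores, base_beam=10.0, max_active=3):
    """One frame-synchronous pruning step over the active hypotheses."""
    beam = variable_beam(len(scores), base_beam, max_active)
    return prune(scores, beam)


if __name__ == "__main__":
    # Few active hypotheses: the full base beam is used.
    print(decode_step({"a": -1.0, "b": -5.0, "c": -20.0}))
    # Many active hypotheses: the beam narrows and more are dropped.
    print(decode_step({"a": -1.0, "b": -5.0, "c": -7.0,
                       "d": -8.0, "e": -9.0, "f": -30.0}))
```

A fixed-width beam would call `prune` with the same `beam` at every frame. The variable-width variant trades a small risk of discarding the correct hypothesis for less work in frames where many hypotheses compete, which is the kind of time saving the abstract reports.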
