Speech Recognition on an FPGA Using Discrete and Continuous Hidden Markov Models

Speech recognition is computationally demanding, particularly the stage that uses Viterbi decoding to convert pre-processed speech data into words or sub-word units. Any device that can reduce the load on, for example, a PC's processor is therefore advantageous. We present FPGA implementations of the decoder based on discrete and on continuous hidden Markov models (HMMs) representing monophones. The discrete version can process speech at nearly 5,000 times real time using just 12% of the slices of a Xilinx Virtex XCV1000, but it has a lower recognition rate than the continuous implementation, which runs 75 times faster than real time and occupies 45% of the same device.
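To make the decoding stage concrete, the following is a minimal software sketch of Viterbi decoding over a discrete HMM. It is not the FPGA design described above: the function name `viterbi`, the toy two-state left-to-right model, and the choice to work with log-probabilities (so that products become sums) are all assumptions made for illustration.

```python
import math

def log0(p):
    """Natural log that maps a zero probability to -inf (illustrative helper)."""
    return math.log(p) if p > 0 else float("-inf")

def viterbi(obs, log_init, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for a discrete-observation HMM.

    obs       : list of observation symbol indices, one per frame
    log_init  : log_init[i]     = log P(first state is i)
    log_trans : log_trans[i][j] = log P(state j at t+1 | state i at t)
    log_emit  : log_emit[j][k]  = log P(symbol k | state j)
    """
    n_states = len(log_init)
    # delta[j]: best log-probability of any path ending in state j at this frame
    delta = [log_init[j] + log_emit[j][obs[0]] for j in range(n_states)]
    backptr = []  # backptr[t][j]: best predecessor of state j at frame t+1
    for o in obs[1:]:
        prev = delta
        delta, ptr = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: prev[i] + log_trans[i][j])
            delta.append(prev[best_i] + log_trans[best_i][j] + log_emit[j][o])
            ptr.append(best_i)
        backptr.append(ptr)
    # Trace back from the best final state to recover the state sequence
    last = max(range(n_states), key=lambda j: delta[j])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return delta[last], path

# Toy model (hypothetical numbers): two states, two observation symbols
init = [log0(p) for p in (1.0, 0.0)]
trans = [[log0(p) for p in row] for row in ((0.5, 0.5), (0.0, 1.0))]
emit = [[log0(p) for p in row] for row in ((0.9, 0.1), (0.2, 0.8))]

score, path = viterbi([0, 0, 1, 1], init, trans, emit)
print(path)  # [0, 0, 1, 1]
```

The inner loop is a max-and-add over all predecessor states for every state and every frame, which is the cost that motivates offloading the decoder from a general-purpose processor.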
