VLSI for 5000-word continuous speech recognition

We have developed a VLSI chip for 5,000 word speaker-independent continuous speech recognition. This chip employs a context-dependent HMM (hidden Markov model) based speech recognition algorithm, and contains emission probability and Viterbi beam search pipelined hardware units. The feature vector for speech recognition is computed using a host processor in software in order to adopt various enhancement algorithms. The amount of internal SRAM size is minimized by moving data out to the external DRAM, and a custom DRAM controller module is designed to efficiently read and write consecutive data. The experimental result shows that the implemented system has a real-time factor of 0.77 and 0.55 using SDRAM and DDR SDRAM, respectively.