Real-time recognition of subword units on a hybrid multi-DSP/ASIC based acoustic front-end

This paper describes the hardware and software structure of the real-time acoustic-phonetic decoding in the speaker-adaptive continuous speech understanding system SPICOS (Siemens, Philips, IPO continuous speech recognition and understanding). SPICOS is designed as a German-language man-machine dialogue interface consisting of acoustic-phonetic decoding, linguistic analysis, dialogue-modeling, and speech-synthesis modules. The acoustic-phonetic decoding is based on an articulatory feature vector, which is used to recognize subword units with hidden Markov models (HMMs). Feature extraction and recognition are supported by special-purpose hardware. For formant extraction, a signal processor calculates 16 LPC reflection coefficients, which are then mapped onto a codebook of 4000 codes containing formant hypotheses. This mapping is performed by a dedicated application-specific integrated circuit (ASIC) designed for vector quantization.
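
The formant-extraction step described above (reflection coefficients computed per frame, then mapped onto a codebook) can be illustrated in software. The following is a minimal sketch, not the SPICOS implementation: the abstract does not say how the signal processor derives the coefficients or which distance measure the ASIC uses, so the Levinson-Durbin recursion, the squared Euclidean distance, the 256-sample Hamming-windowed frame, and the names `reflection_coefficients` and `quantize` are all assumptions. The codebook is filled with random placeholder values, whereas a real one would be trained and each code would carry a formant hypothesis. Only the figures 16 and 4000 come from the abstract.

```python
import math
import random

ORDER = 16            # number of LPC reflection coefficients (from the abstract)
CODEBOOK_SIZE = 4000  # number of codes in the codebook (from the abstract)


def reflection_coefficients(frame, order=ORDER):
    """Levinson-Durbin recursion on the autocorrelation of one frame.

    Assumed method; the sign convention for reflection coefficients varies
    between texts.
    """
    n = len(frame)
    # Autocorrelation for lags 0..order.
    r = [sum(frame[t] * frame[t - lag] for t in range(lag, n))
         for lag in range(order + 1)]
    a = [1.0] + [0.0] * order  # prediction polynomial, a[0] fixed at 1
    err = r[0]                 # prediction error energy
    ks = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
        ks.append(k)
    return ks


def quantize(vector, codebook):
    """Exhaustive nearest-neighbour search; returns the index of the closest code."""
    best_index, best_dist = 0, math.inf
    for index, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, code))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


rng = random.Random(0)

# Placeholder codebook: random vectors, NOT trained formant hypotheses.
codebook = [[rng.uniform(-1.0, 1.0) for _ in range(ORDER)]
            for _ in range(CODEBOOK_SIZE)]

# Synthetic input frame: Hamming-windowed Gaussian noise (hypothetical data).
frame = [rng.gauss(0.0, 1.0) * (0.54 - 0.46 * math.cos(2 * math.pi * t / 255))
         for t in range(256)]

ks = reflection_coefficients(frame)
code_index = quantize(ks, codebook)
print("selected code index:", code_index)
```

The exhaustive search costs 4000 × 16 multiply-accumulate operations per frame. A per-frame workload of that size is a plausible reason to assign vector quantization to dedicated hardware.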