Paper Information - Fixed-Point Arithmetic

Embedded and mobile systems have two main requirements: low power consumption, for long battery life and miniaturization, and low unit cost, for components produced in very large numbers (cell phones, set-top boxes). Both requirements are addressed by CPUs with integer-only arithmetic units, which motivates the fixed-point implementation of automatic speech recognition (ASR) algorithms. Large vocabulary continuous speech recognition (LVCSR) can greatly enhance the usability of devices whose small size and typical on-the-go use hinder more traditional interfaces. The increasing computational power of embedded CPUs will soon allow real-time LVCSR on portable, low-cost devices. This chapter reviews the problems of implementing ASR algorithms in fixed point and presents fixed-point methods that yield the same recognition accuracy as the floating-point algorithms. In particular, it illustrates a practical approach to implementing the frame-synchronous beam-search Viterbi decoder, N-gram language models, HMM likelihood computation, and the mel-cepstrum front-end. The fixed-point recognizer is shown to be as accurate as the floating-point recognizer in several LVCSR experiments, on the DARPA Switchboard task and on an AT&T proprietary task, using different types of acoustic front-ends, HMMs, and language models. Experiments on the DARPA Resource Management task, using the StrongARM-1100 206 MHz and XScale PXA270 624 MHz CPUs, show that the fixed-point implementation enables real-time performance: the floating-point recognizer, which relies on floating-point software emulation, is several times slower for the same accuracy.
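For readers unfamiliar with the term, the following is a minimal sketch of what fixed-point arithmetic on an integer-only unit can look like. It is not taken from the chapter: the Q15 format (15 fractional bits in a 16-bit word), the saturation policy, and all function names are assumptions chosen for illustration.

```python
# Illustrative Q15 fixed-point arithmetic using only integer operations.
# The Q15 format and every name below are assumptions, not the chapter's method.

Q = 15                                  # number of fractional bits
ONE = 1 << Q                            # scale factor: 1.0 maps to 32768 before saturation
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1


def saturate(x):
    """Clamp an integer to the signed 16-bit range instead of wrapping."""
    return max(INT16_MIN, min(INT16_MAX, x))


def to_fixed(x):
    """Quantize a real value in roughly [-1, 1) to a Q15 integer."""
    return saturate(int(round(x * ONE)))


def to_float(q):
    """Convert a Q15 integer back to a float (for inspection only)."""
    return q / ONE


def fixed_mul(a, b):
    """Multiply two Q15 values.

    The raw product carries 2*Q fractional bits, so half an LSB is added
    for rounding and the result is shifted right by Q to return to Q15.
    """
    return saturate((a * b + (1 << (Q - 1))) >> Q)


def fixed_dot(xs, ws):
    """Dot product of Q15 vectors with a wide accumulator and a single final shift."""
    acc = 0
    for x, w in zip(xs, ws):
        acc += x * w                    # keep full precision while accumulating
    return saturate((acc + (1 << (Q - 1))) >> Q)


half = to_fixed(0.5)
quarter = fixed_mul(half, half)
print(to_float(quarter))
```

Accumulating in a wide register and shifting once, as `fixed_dot` does, is one common way to limit rounding error; other scaling and rounding policies are equally possible.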

Enrico Bocchieri
