Towards large vocabulary ASR on embedded platforms

This paper presents an overview of an automatic speech recognition (ASR) system implemented for embedded platforms. We address the specific challenges that low-resource platforms pose for each of the basic components of an ASR decoder. Our main objective is to adapt the technology developed for large vocabulary continuous speech recognition (LVCSR) so that efficient LVCSR becomes possible on embedded systems as well.
