Context-dependent probabilistic hierarchical sublexical modelling using finite state transducers

This paper describes a unified architecture for integrating sub-lexical models with speech recognition, and a layered framework for context-dependent probabilistic hierarchical sub-lexical modelling. Previous work [1, 2, 3] has demonstrated the effectiveness of sub-lexical modelling using a core context-free grammar (CFG) augmented with context-dependent probabilistic models. Our main motivation for designing a unified architecture is to provide a framework in which probabilistic sub-lexical components can be integrated with other speech recognition components without sacrificing the flexibility of developing and configuring each component independently. At the same time, the architecture gives a tightly coupled interface between recognizers and sub-lexical linguistic components. We also present a view of using layered probabilistic models to augment CFGs. This view captures context-dependent probabilistic information beyond the standard CFG formalism, and provides the flexibility to develop suitable probabilistic models independently for each sub-lexical layer. Experimental results show that the context-dependent probabilistic hierarchical sub-lexical modelling approach achieves performance comparable to pronunciation network approaches on utterances that contain only in-vocabulary words, while substantially reducing errors on utterances with previously unseen words.
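As a rough illustration of how layered models can be kept separate yet scored jointly, the sketch below composes two small weighted transducers and finds the lowest-cost output for an input phone sequence. It is not the paper's implementation: the dictionary-based transducer format, the function names `compose` and `best_path`, the phone and phoneme symbols, and all weights are assumptions made for this example. Weights are treated as costs that add along a path (for instance, negative log probabilities), and the transducers are assumed to be epsilon-free.

```python
def compose(a, b):
    """Compose two epsilon-free weighted transducers; costs add along paths.

    Hypothetical format: {"start": s, "finals": {state: cost},
    "arcs": {state: [(in_sym, out_sym, cost, next_state), ...]}}.
    """
    start = (a["start"], b["start"])
    arcs, finals = {}, {}
    stack, seen = [start], {start}
    while stack:
        state = stack.pop()
        qa, qb = state
        if qa in a["finals"] and qb in b["finals"]:
            finals[state] = a["finals"][qa] + b["finals"][qb]
        out = arcs.setdefault(state, [])
        for i, mid, w1, na in a["arcs"].get(qa, []):
            for mid2, o, w2, nb in b["arcs"].get(qb, []):
                if mid == mid2:  # output of layer A must match input of layer B
                    nxt = (na, nb)
                    out.append((i, o, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return {"start": start, "finals": finals, "arcs": arcs}


def best_path(t, inputs):
    """Return (cost, outputs) of the cheapest accepting path for `inputs`, or None."""
    frontier = {t["start"]: (0.0, [])}
    for sym in inputs:
        nxt = {}
        for state, (cost, outs) in frontier.items():
            for i, o, w, ns in t["arcs"].get(state, []):
                if i == sym:
                    cand = (cost + w, outs + [o])
                    if ns not in nxt or cand[0] < nxt[ns][0]:
                        nxt[ns] = cand
        frontier = nxt
    best = None
    for state, (cost, outs) in frontier.items():
        if state in t["finals"]:
            total = cost + t["finals"][state]
            if best is None or total < best[0]:
                best = (total, outs)
    return best


# Layer A (toy): phones -> phonemes, one state, context-independent costs.
A = {"start": 0, "finals": {0: 0.0}, "arcs": {0: [
    ("t", "T", 0.25, 0), ("dx", "T", 1.0, 0),
    ("dx", "D", 0.5, 0), ("ae", "AE", 0.0, 0)]}}

# Layer B (toy): phonemes -> labelled phonemes; state 1 means "after a vowel",
# so the cost of a consonant depends on its left context.
B = {"start": 0, "finals": {0: 0.0, 1: 0.0}, "arcs": {
    0: [("T", "T/onset", 0.25, 0), ("D", "D/onset", 0.25, 0),
        ("AE", "AE/nucleus", 0.0, 1)],
    1: [("T", "T/coda", 0.5, 0), ("D", "D/coda", 1.5, 0),
        ("AE", "AE/nucleus", 0.5, 1)]}}

result = best_path(compose(A, B), ["ae", "dx"])
print(result)
```

In this toy setup, layer A alone prefers reading the phone `dx` as `D`, but the context-dependent costs in layer B make `T` the cheaper choice after a vowel. Each layer can be edited or re-estimated on its own, and composition produces the joint model.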

[1]  James R. Glass, et al. A probabilistic framework for feature-based speech recognition, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[2]  Victor Zue, et al. JUPITER: a telephone-based conversational interface for weather information, 2000, IEEE Trans. Speech Audio Process.

[3]  Stephanie Seneff, et al. ANGIE: a new framework for speech analysis based on morpho-phonological modelling, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[4]  Stephanie Seneff, et al. A unified framework for sublexical and linguistic modelling supporting flexible vocabulary speech understanding, 1998, ICSLP.

[5]  Grace Chung. A three-stage solution for flexible vocabulary speech understanding, 2000, INTERSPEECH.

[6]  Victor Zue, et al. Sub-lexical modelling using a finite state transducer framework, 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[7]  Mehryar Mohri, et al. Weighted finite-state transducers in speech recognition, 2002.