A hierarchical language model incorporating class-dependent word models for OOV word recognition

A new language model is proposed for recognizing out-of-vocabulary (OOV) words, that is, words not registered in the lexicon. The model is a class N-gram that incorporates a set of class-dependent word models, which reflect the statistical characteristics of the phonotactics of each lexical class. Exploiting this class dependency improves recognition accuracy and makes it possible to identify the class of an OOV word. OOV words are thus recognized as transcribed segments carrying class labels, which supply semantic attributes of the OOV words to subsequent language processing. In experiments on Japanese personal and family names, the model performed nearly as well as in-vocabulary recognition, which serves as the upper bound.
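The two-level structure described above can be illustrated with a minimal sketch. It is not the paper's implementation: the bigram order, add-one smoothing, romanized mora units, toy name lists, class labels (`FAMILY`, `GIVEN`), and class transition probabilities are all assumptions made for illustration. An OOV candidate is scored as the class-level probability times the probability of its sub-word sequence under that class's word model, and the best-scoring class is returned as the label.

```python
import math
from collections import defaultdict


class SubwordBigram:
    """Add-one smoothed bigram over sub-word units for one lexical class.

    Hypothetical stand-in for a class-dependent word model.
    """

    def __init__(self, training_words, alphabet):
        # Predictable symbols: every unit in the alphabet plus the end marker.
        self.num_symbols = len(set(alphabet)) + 1
        self.counts = defaultdict(lambda: defaultdict(int))
        for units in training_words:
            prev = "<w>"
            for unit in list(units) + ["</w>"]:
                self.counts[prev][unit] += 1
                prev = unit

    def log_prob(self, units):
        """Log probability of a unit sequence, including the end marker."""
        lp = 0.0
        prev = "<w>"
        for unit in list(units) + ["</w>"]:
            total = sum(self.counts[prev].values())
            lp += math.log((self.counts[prev][unit] + 1) / (total + self.num_symbols))
            prev = unit
        return lp


def oov_log_score(units, cls, prev_cls, class_bigram, word_models):
    """log P(cls | prev_cls) + log P(units | cls)."""
    return math.log(class_bigram[(prev_cls, cls)]) + word_models[cls].log_prob(units)


def classify_oov(units, prev_cls, class_bigram, word_models):
    """Return the class label that maximizes the combined score."""
    return max(
        word_models,
        key=lambda c: oov_log_score(units, c, prev_cls, class_bigram, word_models),
    )


# Toy training data (assumed): names split into romanized morae.
family_names = [["ta", "na", "ka"], ["ya", "ma", "da"], ["na", "ka", "mu", "ra"]]
given_names = [["ta", "ro", "u"], ["ha", "na", "ko"], ["i", "chi", "ro", "u"]]
alphabet = {u for name in family_names + given_names for u in name}

word_models = {
    "FAMILY": SubwordBigram(family_names, alphabet),
    "GIVEN": SubwordBigram(given_names, alphabet),
}

# Assumed class-level bigram probabilities P(class | previous class).
class_bigram = {
    ("<s>", "FAMILY"): 0.5,
    ("<s>", "GIVEN"): 0.5,
    ("FAMILY", "FAMILY"): 0.3,
    ("FAMILY", "GIVEN"): 0.7,
}

label_1 = classify_oov(["ya", "ma", "na", "ka"], "<s>", class_bigram, word_models)
label_2 = classify_oov(["ko", "ta", "ro", "u"], "FAMILY", class_bigram, word_models)
print(label_1, label_2)  # FAMILY GIVEN
```

Neither test name appears in the toy training lists, so both play the role of OOV words. The class label comes out of the same score that would rank the hypothesis during decoding, which is how a class-dependent word model can both transcribe an OOV word and attach a semantic attribute to it.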
