Retrieving phrases by selecting the history: application to automatic speech recognition

This paper focuses on statistical language modeling for automatic speech recognition. We present a method for finding linguistic units in a corpus. This method, called the Selected History Principle, consists of finding strong distant relationships between words. The new units are phrases made up of basic units of our vocabulary that are linked by these distant relationships. We adapt the multigram principle to large vocabularies in order to introduce an optimal subset of these phrases into a bigram model. The bigram model using these phrases outperforms the baseline bigram model by 21% in terms of perplexity and increases the recognition rate of the large-vocabulary system Sirocco by 8.7%. The word error rate decreases by 12.7%.
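For readers unfamiliar with the perplexity measure used above, the following is a minimal sketch of how the perplexity of a bigram model can be computed. It is not the paper's implementation: the function names, the add-one smoothing, and the `n_words` parameter are assumptions made for illustration. The `n_words` argument lets a model whose tokens are phrases be normalized by the number of underlying words, so that its perplexity stays comparable with a word-level baseline. The sketch also assumes every test token was seen in training.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Collect history and bigram counts from tokenized sentences.

    Each sentence is a list of tokens; a token may be a single word or a
    phrase merged into one unit (hypothetical convention: "new_york").
    """
    history_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        vocab.update(tokens[1:])  # every token that can be predicted
        for prev, cur in zip(tokens, tokens[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return history_counts, bigram_counts, vocab


def perplexity(sentences, model, n_words=None):
    """Perplexity of an add-one smoothed bigram model (illustrative choice).

    n_words: number of units to normalize by. If omitted, the number of
    predicted tokens (including </s>) is used.
    """
    history_counts, bigram_counts, vocab = model
    v = len(vocab)
    log_prob = 0.0
    n_events = 0
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            p = (bigram_counts[(prev, cur)] + 1) / (history_counts[prev] + v)
            log_prob += math.log(p)
            n_events += 1
    return math.exp(-log_prob / (n_words or n_events))
```

A lower value means the model assigns higher probability to the test text; the relative improvement of one model over another is then `(pp_baseline - pp_new) / pp_baseline`.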
