Multigrams for language identification

We present two new approaches to language identification. Both are based on multigrams, an information-theoretic representation of observation sequences. In the first approach, we use multigram models for phonotactic modeling of phoneme or codebook sequences. The multigram model segments a new observation into larger units (e.g., word-like units) and computes the probability of the best segmentation. In the second approach, we build a fenon recognizer whose recognition vocabulary consists of the segments from the best segmentation of the training material, treated as “words”. On the OGI test corpus and the NIST’95 evaluation corpus, the second approach yielded significant improvements over the unsupervised codebook approach when discriminating between English and German utterances.
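To illustrate the first approach, the following minimal sketch finds the best segmentation of a symbol sequence with a Viterbi search and then picks the language whose model scores that segmentation highest. It is an illustration under stated assumptions, not the authors' implementation: it assumes independent units with fixed probabilities, and the function names, the `max_len` limit, and the toy probability tables are all hypothetical.

```python
import math

def best_segmentation(symbols, unit_probs, max_len=3):
    """Viterbi search for the most probable segmentation of `symbols`.

    `unit_probs` maps a tuple of symbols (one candidate unit) to its
    probability. Units are assumed independent, so the score of a
    segmentation is the sum of the log probabilities of its units.
    Returns (segments, log_prob), or (None, -inf) if no segmentation exists.
    """
    n = len(symbols)
    best = [-math.inf] * (n + 1)   # best[i]: best log prob of symbols[:i]
    back = [0] * (n + 1)           # back[i]: start index of the last unit
    best[0] = 0.0
    for end in range(1, n + 1):
        for length in range(1, min(max_len, end) + 1):
            start = end - length
            p = unit_probs.get(tuple(symbols[start:end]))
            if p is None or best[start] == -math.inf:
                continue
            score = best[start] + math.log(p)
            if score > best[end]:
                best[end] = score
                back[end] = start
    if best[n] == -math.inf:
        return None, -math.inf
    # Follow the back pointers to recover the units of the best path.
    segments = []
    end = n
    while end > 0:
        start = back[end]
        segments.append(tuple(symbols[start:end]))
        end = start
    segments.reverse()
    return segments, best[n]

def identify(symbols, models, max_len=3):
    """Return the name of the model with the highest best-segmentation score."""
    return max(models, key=lambda name: best_segmentation(symbols, models[name], max_len)[1])

# Toy unit probabilities (hypothetical, not taken from the paper).
lang_a = {("a",): 0.2, ("b",): 0.2, ("a", "b"): 0.5, ("b", "a"): 0.1}
lang_b = {("a",): 0.4, ("b",): 0.4, ("a", "b"): 0.1, ("b", "a"): 0.1}
models = {"lang_a": lang_a, "lang_b": lang_b}

segments, log_prob = best_segmentation(list("abab"), lang_a)
decision = identify(list("abab"), models)
```

In this toy setup, `lang_a` prefers the two-symbol unit `("a", "b")`, so the sequence `abab` is segmented into two such units and attributed to `lang_a`. A real system would estimate the unit probabilities from training data for each language rather than fixing them by hand.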
