Paper information - Statistical language modeling with a class-based n-multigram model

Statistical language modeling with a class-based n-multigram model

In this paper, we present a stochastic language-modeling tool which aims at retrieving variable-length phrases (multigrams), assuming n-gram dependencies between them, hence the name of the model: n-multigram. The estimation of the probability distribution of the phrases is intermixed with a phrase-clustering procedure in a way which jointly optimizes the likelihood of the data. As a result, the language data are iteratively structured at both a paradigmatic and a syntagmatic level in a fully integrated way. We evaluate the 2-multigram model as a statistical language model on ATIS, a task-oriented database consisting of air travel reservations. Experiments show that the 2-multigram model allows a reduction of 10% of the word error rate on ATIS with respect to the usual trigram model, with 25% fewer parameters than in the trigram model. In addition, we illustrate the ability of this model to merge semantically related phrases of different lengths into a common class.
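To make the idea of "n-gram dependencies between variable-length phrases" concrete, the following is a minimal Python sketch of how the likelihood of a word sequence could be computed under a 2-multigram model, by summing over every segmentation of the sequence into phrases with a forward recursion. It is an illustration under stated assumptions, not the authors' implementation: the names `sequence_likelihood`, `make_class_bigram`, and the `START` marker are hypothetical, the toy probabilities are invented for the example, and the class-based factorization p(class(cur) | class(prev)) * p(cur | class(cur)) is the usual class-bigram form, assumed here because the abstract does not spell out the parameterization.

```python
START = ("<s>",)  # hypothetical sentence-start marker (an assumption)


def sequence_likelihood(words, phrase_bigram, max_len):
    """Sum, over all segmentations of `words` into phrases of at most
    `max_len` words, of the product of phrase-bigram probabilities."""
    n = len(words)
    # alpha[t][k]: total probability of all segmentations of words[:t]
    # whose last phrase is k words long (k == 0 only for the empty prefix).
    alpha = [dict() for _ in range(n + 1)]
    alpha[0][0] = 1.0
    for t in range(1, n + 1):
        for k in range(1, min(max_len, t) + 1):
            begin = t - k
            phrase = tuple(words[begin:t])
            total = 0.0
            for j, mass in alpha[begin].items():
                # The previous phrase is the j words that end at `begin`.
                prev = START if begin == 0 else tuple(words[begin - j:begin])
                total += mass * phrase_bigram(prev, phrase)
            if total > 0.0:
                alpha[t][k] = total
    return sum(alpha[n].values())


def make_class_bigram(phrase_class, class_trans, emit):
    """Build a phrase-bigram function from an assumed class factorization:
    p(cur | prev) = p(class(cur) | class(prev)) * p(cur | class(cur))."""
    def phrase_bigram(prev, cur):
        c_prev = phrase_class.get(prev)
        c_cur = phrase_class.get(cur)
        return class_trans.get((c_prev, c_cur), 0.0) * emit.get(cur, 0.0)
    return phrase_bigram


# Toy parameters, invented for illustration only.
phrase_class = {START: "S", ("thank",): "W", ("you",): "W",
                ("thank", "you"): "P"}
class_trans = {("S", "W"): 0.5, ("S", "P"): 0.5, ("W", "W"): 0.5}
emit = {("thank",): 0.5, ("you",): 0.5, ("thank", "you"): 1.0}

toy_model = make_class_bigram(phrase_class, class_trans, emit)
likelihood = sequence_likelihood(["thank", "you"], toy_model, max_len=2)
```

In this toy setup the two segmentations "thank | you" and "thank you" both contribute to `likelihood`. Because phrases of different lengths can share a class, a one-word and a two-word phrase can be interchangeable from the point of view of the class-level bigram, which is the behavior the abstract describes as merging related phrases of different lengths into a common class.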

Yoshinori Sagisaka | Sabine Deligne
