In this paper we propose a new approach to language modeling based on dynamic Bayesian networks. The principal idea of our approach is to find the dependence relations between the variables that represent the different linguistic units (word, class, concept, ...) that constitute a language model. In this paper, the linguistic units we consider are syntactic classes and words. Our approach should not be regarded as a model combination technique. Rather, it is an original and coherent methodology that processes words and classes within the same model. We attempt to identify and model the dependence of words and classes on their linguistic context. Our ultimate goal is to devise an automatic mechanism that extracts the best dependence relations between a word and its lexical and syntactic context. Preliminary results are encouraging. In particular, the model in which a word depends not only on the previous word but also on the syntactic classes of the two previous words outperforms the bigram model.
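To make the best-performing dependence structure concrete, the following minimal sketch estimates P(w_i | w_{i-1}, c_{i-1}, c_{i-2}) by relative frequency on a toy class-tagged corpus. It only illustrates the conditioning context described above, not the paper's dynamic Bayesian network training procedure. The function names `train` and `prob`, the start symbols, the tag set (DET, NOUN, VERB), and the absence of smoothing are all assumptions made for the example.

```python
from collections import Counter


def train(tagged_sentences):
    """Count events for P(w_i | w_{i-1}, c_{i-1}, c_{i-2}).

    Each sentence is a list of (word, syntactic_class) pairs.
    """
    event_counts = Counter()    # (w_{i-1}, c_{i-1}, c_{i-2}, w_i) -> count
    context_counts = Counter()  # (w_{i-1}, c_{i-1}, c_{i-2}) -> count
    for sentence in tagged_sentences:
        # Hypothetical start symbols so that every word has a full context.
        padded = [("<s>", "<S>"), ("<s>", "<S>")] + list(sentence)
        for i in range(2, len(padded)):
            word = padded[i][0]
            context = (padded[i - 1][0], padded[i - 1][1], padded[i - 2][1])
            context_counts[context] += 1
            event_counts[context + (word,)] += 1
    return event_counts, context_counts


def prob(word, prev_word, prev_class, prev2_class, event_counts, context_counts):
    """Relative-frequency estimate; returns 0.0 for unseen contexts (no smoothing)."""
    context = (prev_word, prev_class, prev2_class)
    total = context_counts[context]
    if total == 0:
        return 0.0
    return event_counts[context + (word,)] / total


# Toy corpus with made-up class tags, for illustration only.
corpus = [
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("eats", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]
events, contexts = train(corpus)
p_sleeps = prob("sleeps", "cat", "NOUN", "DET", events, contexts)
```

Here the context ("cat", NOUN, DET) pools the histories "the cat" and "a cat" because only the class of the second previous word is used, which is how class information can reduce sparsity relative to a word trigram. A practical model would add smoothing or back-off for unseen contexts.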