A Markov random field model for automatic speech recognition

Speech can be represented as a time/frequency distribution of energy using a multiband filter bank. A Markov random field model, which takes into account possible time asynchrony across the bands, is estimated for each segmental unit to be recognized. The probability law of the speech process is given by a parametric Gibbs distribution, and a maximum likelihood parameter estimation algorithm is developed. Experiments are conducted on an isolated word recognition task. In the mono-band case, the new model performs similarly to standard HMM techniques. In the multiband case, modeling interband synchrony is shown to be a promising way to improve performance as the number of bands increases.
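
To make the idea of a Gibbs distribution over a time/frequency lattice concrete, the following is a minimal sketch and not the model of the paper. It assumes a Potts-style pairwise potential, with hypothetical parameters `beta_time` (penalty for a label change between consecutive frames of one band) and `beta_band` (penalty for a label mismatch between adjacent bands at the same frame). The function names and the brute-force normalization are illustrative choices that are only feasible for tiny lattices.

```python
import itertools
import math


def gibbs_energy(states, beta_time, beta_band):
    """Energy of a label lattice states[band][frame] (assumed Potts-style potentials)."""
    n_bands = len(states)
    n_frames = len(states[0])
    energy = 0.0
    for b in range(n_bands):
        for t in range(n_frames):
            # Temporal clique: consecutive frames within the same band.
            if t + 1 < n_frames and states[b][t] != states[b][t + 1]:
                energy += beta_time
            # Interband clique: same frame in two adjacent bands.
            if b + 1 < n_bands and states[b][t] != states[b + 1][t]:
                energy += beta_band
    return energy


def gibbs_probability(states, n_labels, beta_time, beta_band):
    """P(states) = exp(-U(states)) / Z, with Z computed by exhaustive enumeration."""
    n_bands, n_frames = len(states), len(states[0])
    z = 0.0
    for flat in itertools.product(range(n_labels), repeat=n_bands * n_frames):
        # Reshape the flat label tuple into a bands x frames lattice.
        lattice = [flat[b * n_frames:(b + 1) * n_frames] for b in range(n_bands)]
        z += math.exp(-gibbs_energy(lattice, beta_time, beta_band))
    return math.exp(-gibbs_energy(states, beta_time, beta_band)) / z


# Two bands, three frames: the bands switch label at the same frame, or one frame apart.
synchronous = [[0, 0, 1], [0, 0, 1]]
asynchronous = [[0, 0, 1], [0, 1, 1]]
p_sync = gibbs_probability(synchronous, 2, beta_time=1.0, beta_band=0.5)
p_async = gibbs_probability(asynchronous, 2, beta_time=1.0, beta_band=0.5)
```

Under these assumed potentials, the asynchronous lattice is not forbidden. It only pays an extra `beta_band` for the frame where the two bands disagree, so it is less probable than the synchronous one by a factor that the interband parameter controls.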
