This paper proposes a new method in which speech recognition is executed frame by frame along the time axis using local parallel operations based on Markov random fields (MRF). Few studies have addressed the parallel execution of speech processing. Nevertheless, the parallel recognition algorithms proposed in this paper are expected to be very useful in applications that demand substantial computational power, such as high-performance continuous speech recognition systems.
The essence of the parallel execution is to estimate the optimal state sequence, given the model parameters and the sequence of observed values, by a parallel process based on iterated conditional modes (ICM). This requires the local probability of the state sequence. It is shown that this local probability can be derived by representing the generation probability of the state sequence in a hidden Markov model (HMM) as a Gibbs distribution and calculating its conditional distribution.
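The ICM estimation described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`to_log`, `icm_decode`, `joint_log_prob`), the discrete observation symbols, the emission-only initial labelling, and the even/odd frame update schedule are all assumptions made for illustration. Frames of the same parity are not neighbours, so each half-sweep could in principle be run in parallel.

```python
import math


def to_log(p):
    """Log of a probability, mapping 0 to -inf instead of raising."""
    return math.log(p) if p > 0.0 else float("-inf")


def joint_log_prob(log_pi, log_A, log_B, obs, states):
    """Log joint probability of a state sequence and observations in an HMM."""
    lp = log_pi[states[0]] + log_B[states[0]][obs[0]]
    for t in range(1, len(obs)):
        lp += log_A[states[t - 1]][states[t]] + log_B[states[t]][obs[t]]
    return lp


def icm_decode(log_pi, log_A, log_B, obs, init=None, max_sweeps=50):
    """Estimate an HMM state sequence by iterated conditional modes (ICM).

    log_pi[i]   : log initial probability of state i
    log_A[i][j] : log transition probability from state i to state j
    log_B[i][o] : log probability of emitting symbol o in state i
    obs         : list of observed symbol indices
    """
    n_states, T = len(log_pi), len(obs)
    if init is None:
        # Assumed initial labelling: best state per frame from emissions alone.
        states = [max(range(n_states), key=lambda i: log_B[i][o]) for o in obs]
    else:
        states = list(init)

    def local_score(t, i):
        # Unnormalised log local probability of state i at frame t,
        # given the current states of the two neighbouring frames.
        left = log_pi[i] if t == 0 else log_A[states[t - 1]][i]
        right = 0.0 if t == T - 1 else log_A[i][states[t + 1]]
        return left + log_B[i][obs[t]] + right

    for _ in range(max_sweeps):
        changed = False
        for parity in (0, 1):
            # Compute all updates for this parity first, then apply them,
            # which mimics a simultaneous (parallel) update of those frames.
            best = {
                t: max(range(n_states), key=lambda i, t=t: local_score(t, i))
                for t in range(parity, T, 2)
            }
            for t, i in best.items():
                if i != states[t]:
                    states[t] = i
                    changed = True
        if not changed:
            break  # fixed point: no frame wants to change its state
    return states
```

Each half-sweep cannot decrease the joint probability, so the procedure settles on a local maximum; unlike the Viterbi algorithm, it does not guarantee the globally optimal sequence, and the result depends on the initial labelling.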
This property implies that the one-sided Markov chain used in the HMM can be converted into a two-sided Markov chain in a one-dimensional MRF. A speaker-independent digit recognition experiment shows that the proposed parallel processing algorithm achieves recognition performance comparable to that of the Viterbi algorithm.
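The conversion from a one-sided to a two-sided chain can be written out as follows. The notation is the standard HMM notation and is an assumption here, not taken from the paper: initial probabilities $\pi_i$, transition probabilities $a_{ij}$, output probabilities $b_j(o)$, state sequence $S = (s_1, \dots, s_T)$, and observations $O = (o_1, \dots, o_T)$.

```latex
% Joint probability of the HMM written in Gibbs form
P(S, O) = \pi_{s_1} b_{s_1}(o_1) \prod_{t=2}^{T} a_{s_{t-1} s_t}\, b_{s_t}(o_t)
        = \exp\{-U(S, O)\},
\qquad
U(S, O) = -\log \pi_{s_1}
          - \sum_{t=2}^{T} \log a_{s_{t-1} s_t}
          - \sum_{t=1}^{T} \log b_{s_t}(o_t).

% Local (two-sided) conditional probability for an interior frame 1 < t < T
P(s_t \mid s_{t-1}, s_{t+1}, o_t)
  = \frac{a_{s_{t-1} s_t}\, b_{s_t}(o_t)\, a_{s_t s_{t+1}}}
         {\sum_{k} a_{s_{t-1} k}\, b_{k}(o_t)\, a_{k s_{t+1}}}.
```

All factors of the joint probability that do not involve $s_t$ cancel in the conditional, so the state at frame $t$ depends only on its two neighbouring states and on the observation at that frame.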