Optimisation of HMM topology and its model parameters by genetic algorithms

Abstract Hidden Markov model (HMM) is currently the most popular approach to speech recognition. However, the problems of finding a good HMM model and its optimised model parameters are still of great interest to the researchers in this area. In our previous work, we have successfully applied the genetic algorithm (GA) to the HMM training process to obtain the optimised model parameters (Chau et al. Proc. ICASSP (1997) 1727) of the HMM models. In this paper, we further extend our work and propose a new training method based on GA and Baum–Welch algorithms to obtain an HMM model with optimised number of states in the HMM models and its model parameters. In this work, we are not only able to overcome the shortcomings of the slow convergence speed of the simple GA-HMM approach. In addition, this method also finds better number of states in the HMM topology as well as its model parameters. From our experiments with the 100 words extracted from the TIMIT corpus, our method is able to find the optimal topology in all cases. In addition, the HMMs trained by our GA HMM training have a better recognition capability than the HMMs trained by the Baum–Welch algorithm. In addition, 290 words are randomly selected from the TMIIT database for testing the recognition performances of both approaches, it is found that the GA-HMM approach has a recognition rate of 95.86% while the Baum–Welch method has a recognition rate of 93.1%. This implies that the HMMs trained by our GA-HMM method are more optimised than the HMMs trained by the Baum–Welch method.

[1]  Lalit R. Bahl,et al.  Automatic recognition of continuously spoken sentences from a finite state grammer , 1978, ICASSP.

[2]  Sam Kwong,et al.  Genetic algorithms and their applications , 1996, IEEE Signal Process. Mag..

[3]  J. Baker,et al.  The DRAGON system--An overview , 1975 .

[4]  L. Baum,et al.  An inequality with applications to statistical estimation for probabilistic functions of Markov processes and to a model for ecology , 1967 .

[5]  Sam Kwong,et al.  Genetic Algorithms : Concepts and Designs , 1998 .

[6]  L. Baum,et al.  An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process , 1972 .

[7]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[8]  Philip C. Woodland,et al.  Spontaneous speech recognition for the credit card corpus using the HTK toolkit , 1994, IEEE Trans. Speech Audio Process..

[9]  Kim Fung Man,et al.  Genetic algorithms for control and signal processing , 1997, Proceedings of the IECON'97 23rd International Conference on Industrial Electronics, Control, and Instrumentation (Cat. No.97CH36066).

[10]  L. Baum,et al.  A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains , 1970 .

[11]  Frederick Jelinek,et al.  The development of an experimental discrete dictation recognizer , 1985 .

[12]  F. Jelinek,et al.  Continuous speech recognition by statistical methods , 1976, Proceedings of the IEEE.

[13]  R. Bakis Continuous speech recognition via centisecond acoustic states , 1976 .

[14]  L. R. Rabiner,et al.  An introduction to the application of the theory of probabilistic functions of a Markov process to automatic speech recognition , 1983, The Bell System Technical Journal.

[15]  Wayne H. Ward,et al.  Speech recognition , 1997 .