Center-distance continuous probability models and the distance measure

This paper describes a new statistical model for speech recognition, the Center-Distance Continuous Probability Model (CDCPM), which is based on the Center-Distance Normal (CDN) distribution. A CDCPM omits the state transition probability matrix, and the observation probability density function (PDF) of each state takes the form of an embedded multiple-model (EMM) based on the nearest-neighbour rule. Experimental results on two large real-world Chinese speech databases and on a real-world continuous-manner 2000-phrase system show that the model is effective. A distance measure for CDCPMs is also proposed, based on Bayesian minimum classification error (MCE) discrimination.
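The abstract does not give the model's formulas, so the following is only a minimal sketch of how the pieces it names could fit together. It rests on three assumptions: that the CDN density is a half-normal density over the Euclidean distance between an observation and a center, that the nearest-neighbour EMM scores a frame by its best-matching sub-model, and that states are traversed strictly left to right with at least one frame per state. The function names `log_cdn`, `emm_log_score`, and `align_log_score` are hypothetical and are not taken from the paper.

```python
import math

def log_cdn(x, center, sigma):
    """Log of an ASSUMED CDN density: half-normal over the distance d(x, center)."""
    d = math.dist(x, center)  # Euclidean distance (assumption)
    return (math.log(2.0) - 0.5 * math.log(2.0 * math.pi)
            - math.log(sigma) - d * d / (2.0 * sigma * sigma))

def emm_log_score(x, submodels):
    """Nearest-neighbour rule (assumed): keep the best-scoring sub-model only.

    submodels is a list of (center, sigma) pairs belonging to one state.
    """
    return max(log_cdn(x, center, sigma) for center, sigma in submodels)

def align_log_score(frames, states):
    """Best left-to-right alignment score with no transition probabilities.

    Each state must absorb at least one frame; a frame either stays in the
    current state or advances to the next one, and neither move carries a cost.
    """
    neg_inf = float("-inf")
    best = [neg_inf] * len(states)
    best[0] = emm_log_score(frames[0], states[0])
    for frame in frames[1:]:
        new = [neg_inf] * len(states)
        for j, submodels in enumerate(states):
            prev = best[j] if j == 0 else max(best[j], best[j - 1])
            if prev > neg_inf:
                new[j] = prev + emm_log_score(frame, submodels)
        best = new
    return best[-1]  # the path must end in the last state

# Toy example: two states, one sub-model each, three 2-D frames.
states = [[((0.0, 0.0), 1.0)], [((3.0, 3.0), 1.0)]]
frames = [(0.0, 0.0), (0.0, 0.0), (3.0, 3.0)]
score = align_log_score(frames, states)
```

Because no transition matrix is involved, the alignment score in this sketch depends only on the per-frame observation scores, which is the property the abstract highlights.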
