Fuzzy Vector Quantization on the Modeling of Discrete Hidden Markov Model for Speech Recognition

This paper applies fuzzy vector quantization (FVQ) to the modeling of the discrete hidden Markov model (DHMM) in order to improve the recognition rate for Mandarin speech. Codebook-based vector quantization is a fundamental step in recognizing speech signals with a DHMM. A codebook is first trained by the K-means algorithm on Mandarin training speech. Then, based on the trained codebook, the speech features are quantized by fuzzy sets defined on each vector of the codebook. The quantized speech features are subsequently used to train the DHMM statistically for speech recognition. All speech features to be recognized must pass through FVQ with the fuzzy codebook before being fed into the DHMM for recognition. Experimental results show that the speech recognition rate can be improved by using the FVQ algorithm to train the DHMM.
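The pipeline described above (K-means codebook, fuzzy quantization against the codebook, then DHMM scoring) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not give the membership function, so the sketch borrows the fuzzy c-means style membership with a fuzzifier `m`, and it assumes the DHMM emission for a state is the membership-weighted sum of that state's discrete symbol probabilities. The function names `train_codebook`, `fuzzy_quantize`, and `fuzzy_emission` are hypothetical.

```python
import numpy as np


def train_codebook(features, k, iters=20, seed=0):
    """Train a k-entry codebook with plain K-means (Lloyd's algorithm)."""
    rng = np.random.default_rng(seed)
    # Initialise codewords from k distinct training vectors.
    idx = rng.choice(len(features), size=k, replace=False)
    codebook = features[idx].astype(float).copy()
    for _ in range(iters):
        # Distance from every feature vector to every codeword.
        dist = np.linalg.norm(
            features[:, None, :] - codebook[None, :, :], axis=2
        )
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old codeword if a cell is empty
                codebook[j] = members.mean(axis=0)
    return codebook


def fuzzy_quantize(x, codebook, m=2.0, eps=1e-12):
    """Return the membership of x in each codeword (non-negative, sums to 1).

    Assumption: fuzzy c-means style membership with fuzzifier m > 1.
    A hard quantizer would instead return only the nearest codeword index.
    """
    d2 = np.sum((codebook - x) ** 2, axis=1) + eps
    weights = d2 ** (-1.0 / (m - 1.0))
    return weights / weights.sum()


def fuzzy_emission(memberships, b_state):
    """Emission likelihood of one frame in one DHMM state.

    Assumption: b_state[k] is the state's discrete probability of symbol k,
    and the fuzzy emission is the membership-weighted sum over symbols.
    """
    return float(memberships @ b_state)


if __name__ == "__main__":
    # Toy 2-D "features" standing in for real speech feature vectors.
    feats = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
    cb = train_codebook(feats, k=2)
    u = fuzzy_quantize(np.array([1.0, 1.0]), cb)
    print(cb, u, fuzzy_emission(u, np.array([0.5, 0.5])))
```

With hard vector quantization each frame contributes to exactly one codeword; in this sketch every frame contributes to all codewords in proportion to its membership, which is the general idea behind using FVQ with a DHMM.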
