Applications of MFCC and Vector Quantization in speaker recognition
In speaker recognition, most of the computation comes from the likelihood calculations between the feature vectors of the unknown speaker and the models in the database. In this paper, we concentrate on optimizing Mel Frequency Cepstral Coefficients (MFCC) for feature extraction and Vector Quantization (VQ) for feature modeling. We reduce the number of feature vectors by pre-quantizing the test sequence prior to matching, and the number of candidate speakers by ruling out unlikely speakers during the recognition process. The two key measures, the recognition rate and the minimized average distance between the samples, depend on the codebook size and the number of cepstral coefficients. We find that this approach yields significant performance gains when the number of MFCCs and the codebook size are varied. The recognition rate reaches up to 89% and the distortion is reduced by up to 69%.
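The VQ matching step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`train_codebook`, `average_distortion`, `identify`) are hypothetical, plain k-means is assumed for codebook training because the abstract does not name the training algorithm, and synthetic Gaussian vectors stand in for real MFCC frames.

```python
import numpy as np


def train_codebook(features, codebook_size, n_iter=20, seed=0):
    """Fit a VQ codebook to (n_frames, n_coeffs) features with plain k-means.

    Assumption: k-means stands in for the paper's (unspecified) training method.
    """
    rng = np.random.default_rng(seed)
    # Initialise codewords from randomly chosen training frames.
    idx = rng.choice(len(features), size=codebook_size, replace=False)
    codebook = features[idx].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every frame to every codeword.
        dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = dists.argmin(axis=1)
        for k in range(codebook_size):
            members = features[nearest == k]
            if len(members) > 0:  # an empty cell keeps its previous codeword
                codebook[k] = members.mean(axis=0)
    return codebook


def average_distortion(features, codebook):
    """Mean Euclidean distance from each frame to its nearest codeword."""
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return float(np.sqrt(dists.min(axis=1)).mean())


def identify(test_features, codebooks):
    """Return the speaker whose codebook gives the lowest average distortion."""
    scores = {name: average_distortion(test_features, cb)
              for name, cb in codebooks.items()}
    return min(scores, key=scores.get), scores


# Synthetic stand-ins for 12-coefficient MFCC frames of two speakers.
rng = np.random.default_rng(42)
train = {
    "speaker_a": rng.normal(0.0, 1.0, size=(300, 12)),
    "speaker_b": rng.normal(3.0, 1.0, size=(300, 12)),
}
codebooks = {name: train_codebook(x, codebook_size=16) for name, x in train.items()}

# Pre-quantize the test sequence: 100 frames are replaced by 8 centroids,
# so matching costs 8 distance rows per speaker instead of 100.
test_frames = rng.normal(3.0, 1.0, size=(100, 12))
reduced = train_codebook(test_frames, codebook_size=8)

best, scores = identify(reduced, codebooks)
print(best, scores)
```

In this sketch the codebook size and the number of coefficients per frame (16 and 12 here, both arbitrary choices) are the two parameters the abstract identifies as driving recognition rate and distortion, so they are the natural values to sweep in an experiment.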