Effect of different sampling rates and feature vector sizes on speech recognition performance

We systematically evaluate how the sampling rate and the feature vector size affect the performance of a hidden Markov model (HMM) based speech recognizer. We investigate two types of features: linear prediction derived cepstral coefficients (LPCC) and Mel-frequency cepstral coefficients (MFCC). For the LPCC front-end, the optimum sampling rate and feature vector size are 12 kHz and 14, respectively. We also show that the feature vector size at which accuracy peaks depends on the sampling rate. For the MFCC front-end, the optimum sampling rate and feature vector size are 14 kHz and 14, respectively.
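
To illustrate what varying the feature vector size means for an LPCC front-end, the following is a minimal sketch of the standard recursion that converts linear prediction coefficients into cepstral coefficients. It assumes the all-pole model H(z) = 1 / (1 - sum_k a_k z^-k) with unit gain, and it assumes that the feature vector size is the number of cepstral coefficients kept (c_1 to c_n, excluding the gain term c_0). The function name `lpc_to_cepstrum` and its parameters are hypothetical and are not taken from the study; the sketch is not the authors' implementation.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LP coefficients to cepstral coefficients c_1..c_{n_ceps}.

    a      : predictor coefficients [a_1, ..., a_p] for the assumed model
             H(z) = 1 / (1 - sum_k a_k z^-k)
    n_ceps : number of cepstral coefficients to return (assumed here to be
             the feature vector size)
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        # Direct term a_n exists only while n does not exceed the LP order.
        acc = a[n - 1] if n <= p else 0.0
        # Recursive term: sum over k of (k / n) * c_k * a_{n-k},
        # restricted to indices where a_{n-k} is defined (1 <= n-k <= p).
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c


# First-order example: for H(z) = 1 / (1 - 0.5 z^-1), c_n = 0.5**n / n.
coeffs = lpc_to_cepstrum([0.5], 3)
```

Because the recursion continues past the LP order, the number of cepstral coefficients can be chosen independently of the predictor order, which is what makes the feature vector size a separate parameter to sweep alongside the sampling rate.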