An improved SIFT method for pitch estimation of speech

This paper presents an improved SIFT (Simplified Inverse Filtering Technique) method for accurate pitch estimation. To save computing time while preserving the precision of the autocorrelation, different re-sampling ratios are used for LPC (Linear Predictive Coding) coefficient analysis and for inverse filtering. To satisfy the requirements on both the range and the accuracy of the pitch frequency, Hamming weighting is adopted when searching for a reliable peak on the autocorrelation curves, and a four-point non-linear pitch-smoothing algorithm is designed to avoid incoherent errors, for example in transient speech frames. Finally, the smoothed pitch contour is extracted and time-normalized pitch frequencies are calculated, which can then be used as features of a speech utterance in speech recognition or speaker recognition systems. Experiments show that the proposed method performs well for pitch estimation of speech.
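
For orientation, the classical SIFT pipeline that the abstract builds on (low-pass and decimate, estimate a low-order LPC model, inverse-filter, then pick the autocorrelation peak of the residual) might be sketched as follows. This is a minimal illustration, not the paper's improved method: it uses a single decimation ratio for both stages and omits the Hamming-weighted peak search and the four-point smoother, whose details the abstract does not give. The function names, the decimation factor of 4, the LPC order of 4, the 60-400 Hz search range, and the 0.3 voicing threshold are all assumptions chosen for the example.

```python
import numpy as np

def decimate(x, factor, taps=63):
    """Low-pass with a Hamming-windowed sinc FIR, then keep every factor-th sample.
    Assumes len(x) >= taps."""
    n = np.arange(taps) - (taps - 1) / 2
    h = np.sinc(n / factor) * np.hamming(taps)  # cutoff at the new Nyquist frequency
    h /= h.sum()                                # unity gain at DC
    return np.convolve(x, h, mode="same")[::factor]

def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC via Levinson-Durbin.
    Returns a[0..order] with a[0] = 1, the coefficients of the inverse filter A(z)."""
    n = len(frame)
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0] + 1e-12  # small floor avoids division by zero on silent frames
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a

def sift_pitch(frame, fs, factor=4, order=4, fmin=60.0, fmax=400.0,
               voicing_threshold=0.3):
    """Estimate the pitch (Hz) of one frame; returns 0.0 if judged unvoiced.
    All default parameter values are illustrative assumptions."""
    low = decimate(np.asarray(frame, dtype=float), factor)
    fs_low = fs / factor
    windowed = low * np.hamming(len(low))
    a = lpc_coefficients(windowed, order)
    # Inverse filtering: pass the signal through A(z) to flatten the spectral envelope.
    residual = np.convolve(windowed, a)[:len(windowed)]
    ac = np.correlate(residual, residual, mode="full")[len(residual) - 1:]
    if ac[0] <= 0.0:
        return 0.0
    lag_min = max(1, int(fs_low / fmax))
    lag_max = min(int(fs_low / fmin), len(ac) - 1)
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    if ac[lag] / ac[0] < voicing_threshold:
        return 0.0
    return fs_low / lag
```

Calling `sift_pitch(frame, 8000)` on a 40 ms frame sampled at 8 kHz searches lags at the decimated 2 kHz rate, so the estimate is quantized to whole lags at that rate. That coarse resolution is one reason a method may prefer a different re-sampling ratio for the inverse-filtering stage than for the LPC analysis.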
