Combining Zero Replacement Speech Enhancement with Lag Window Method for Pitch Detection

Pitch is one of the most essential features in human speech analysis. Although numerous pitch detection methods have been developed, achieving high detection accuracy in noisy environments remains a challenge. In this paper, we propose a noise-robust pitch detection method that combines a speech enhancement algorithm with a spectral flattening algorithm. In the experiments, we compare the proposed method with several widely used or state-of-the-art pitch detection methods. The results show that the proposed method achieves the lowest gross pitch error (GPE) rate among all the compared methods on male speech corrupted by additive white noise. Moreover, comparing the pitch estimates of the proposed method with those of the conventional lag window method shows that the speech enhancement algorithm helps reduce pitch errors.
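For readers unfamiliar with the GPE rate used as the evaluation metric above, the following is a minimal sketch of how such a rate is commonly computed. It is not taken from the paper: the function name `gross_pitch_error_rate`, the 20% relative tolerance, and the convention that a reference value of 0 marks an unvoiced frame are all assumptions made for illustration, and the paper's exact definition may differ.

```python
def gross_pitch_error_rate(f0_ref, f0_est, tolerance=0.2):
    """Fraction of voiced frames whose estimate is a gross error.

    Assumptions (not from the paper): f0 values are in Hz, one per frame;
    a reference value of 0 marks an unvoiced frame; an estimate is a gross
    error when it deviates from the reference by more than `tolerance`
    (relative, 20% by default).
    """
    if len(f0_ref) != len(f0_est):
        raise ValueError("reference and estimate must have the same length")

    # Only frames that are voiced in the reference are scored.
    voiced = [(r, e) for r, e in zip(f0_ref, f0_est) if r > 0]
    if not voiced:
        return 0.0

    gross = sum(1 for r, e in voiced if abs(e - r) > tolerance * r)
    return gross / len(voiced)


# Hypothetical frame-level pitch tracks (Hz), for illustration only.
reference = [100.0, 120.0, 0.0, 110.0]
estimate = [102.0, 240.0, 90.0, 111.0]
gpe = gross_pitch_error_rate(reference, estimate)
print(f"GPE rate: {gpe:.3f}")
```

In this made-up example the second frame is an octave error, so one of the three voiced frames counts as a gross error; the third frame is ignored because the reference marks it as unvoiced.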
