Pitch determination of noisy speech using higher order statistics

This paper shows how third-order statistics can be used to determine the pitch of a speech signal and how they eliminate the effect of a wide range of noises, including those generated by periodic sources. The proposed algorithm relies on the property that higher-order statistics can extract useful information from voiced frames and can separate speech from noise. Third-order statistics are largely insensitive to most noises (Gaussian, sinusoidal, car noise, etc.) because these noises have a symmetric probability density function, so their third-order cumulants are zero. The algorithm has been tested on noise-corrupted speech at different signal-to-noise ratios and with different kinds of noise. In all cases tested, the new algorithm gives a much better pitch estimate than the conventional autocorrelation method.
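The idea can be illustrated with a minimal sketch. The abstract does not say which cumulant slice or peak-picking rule the algorithm uses, so everything below is an assumption made for illustration: the one-dimensional slice c3(0, k) = E[x(n)^2 x(n+k)], the biased estimator, the 80 to 400 Hz search range, and the names `third_order_cumulant_slice`, `estimate_pitch`, and `synthetic_voiced_frame` are all hypothetical. The synthetic "voiced" frame uses two phase-locked harmonics because a single sinusoid has a symmetric distribution and would give a zero third-order cumulant.

```python
import numpy as np


def third_order_cumulant_slice(frame, max_lag):
    """Biased estimate of c3(0, k) = E[x(n)^2 * x(n+k)] for k = 0..max_lag.

    Assumption: this particular 1-D slice is chosen for illustration only.
    """
    x = np.asarray(frame, dtype=float)
    x = x - x.mean()  # third-order cumulant equals this moment only for zero-mean data
    n = len(x)
    sq = x * x
    return np.array([np.dot(sq[:n - k], x[k:]) / n for k in range(max_lag + 1)])


def estimate_pitch(frame, fs, f_min=80.0, f_max=400.0):
    """Return a pitch estimate in Hz from the largest peak of the cumulant slice.

    Assumption: the search range [f_min, f_max] is a hypothetical default.
    """
    min_lag = int(fs / f_max)
    max_lag = int(fs / f_min)
    c3 = third_order_cumulant_slice(frame, max_lag)
    period = min_lag + int(np.argmax(c3[min_lag:max_lag + 1]))
    return fs / period


def synthetic_voiced_frame(f0, fs, n_samples):
    """Two phase-locked harmonics: periodic at f0 with a skewed amplitude distribution."""
    t = np.arange(n_samples) / fs
    return np.cos(2 * np.pi * f0 * t) + 0.5 * np.cos(4 * np.pi * f0 * t)


fs = 8000
clean = synthetic_voiced_frame(125.0, fs, 4096)
rng = np.random.default_rng(0)
noisy = clean + 0.3 * rng.standard_normal(clean.size)  # additive Gaussian noise

print("clean frame estimate (Hz):", estimate_pitch(clean, fs))
print("noisy frame estimate (Hz):", estimate_pitch(noisy, fs))
```

Because Gaussian noise has a symmetric distribution, its contribution to the cumulant slice averages toward zero as the frame grows, while the skewed periodic component keeps a peak at the pitch period. Real speech frames are shorter and less stationary than this synthetic signal, so the sketch says nothing about the performance reported in the paper.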
