On the use of autocorrelation for pitch extraction: Some statistical considerations and their application to the SIFT algorithm

Abstract: This paper examines the classical pitch determination method based on short-time autocorrelation analysis of the speech signal. Two commonly used estimators and the effect of signal windowing are taken into account. It is shown that a similar decomposition of the estimated autocorrelation holds in both the periodic and the purely random case. This decomposition makes it possible to predict the relative merits of the estimators considered, at least as far as gross pitch determination errors and voicing errors are concerned. The expected behaviour is found to be in good agreement with results obtained by applying the SIFT algorithm to real speech.
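The abstract does not name the two estimators. Assuming they are the usual biased and unbiased short-time autocorrelation estimates, the following minimal sketch shows how they differ on a synthetic periodic frame. The function names, the test signal, and the lag search range are illustrative choices, not taken from the paper, and the sketch does not reproduce the SIFT algorithm itself.

```python
import math

def autocorr_biased(x, max_lag):
    """Biased estimate: the lag-k sum of products is divided by the frame length N."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) / n
            for k in range(max_lag + 1)]

def autocorr_unbiased(x, max_lag):
    """Unbiased estimate: the lag-k sum is divided by the number of products, N - k."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) / (n - k)
            for k in range(max_lag + 1)]

def pick_period(r, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation value."""
    return max(range(min_lag, max_lag + 1), key=lambda k: r[k])

# Hypothetical test frame: 200 samples of a signal with a 50-sample period
# (fundamental plus second harmonic).
PERIOD = 50
N = 200
frame = [math.sin(2 * math.pi * i / PERIOD) + 0.5 * math.sin(4 * math.pi * i / PERIOD)
         for i in range(N)]

r_b = autocorr_biased(frame, 120)
r_u = autocorr_unbiased(frame, 120)

lag_b = pick_period(r_b, 20, 120)   # biased estimate: peak search over lags 20..120
lag_u = pick_period(r_u, 20, 120)   # unbiased estimate: same search range

print("biased peak lag:", lag_b)
print("unbiased peak lag:", lag_u)
print("r_b[50], r_b[100]:", r_b[50], r_b[100])
print("r_u[50], r_u[100]:", r_u[50], r_u[100])
```

By construction the two estimates are related by r_b[k] = (1 - k/N) r_u[k], so the biased estimate carries a triangular taper that lowers the peaks at multiples of the period, while the unbiased estimate leaves the peaks at one and two periods at nearly the same height. On this frame the biased estimate therefore favours the true period over its double, whereas the unbiased estimate gives little basis for choosing between them.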
