Analysis of vocal pulses in a speech signal

An algorithm is described for estimating the positions and durations of vocal pulses in real speech signals. In testing, it was on average twice as accurate as the best of the competing algorithms. It is also less sensitive to spectral distortion in telephone channels, to various types of noise, and to instability in the duration and amplitude of the pulses produced by the voice source. The pulse position estimates are accurate enough for synchronous analysis of the speech signal, and the processing speed makes the algorithm suitable for real-time operation.
