An Iterative Approach to Pitch-Marking of Speech Signals without Electroglottographic Data

We propose an iterative approach to high-quality pitch-marking of speech recordings without the use of laryngographic data. Our method first identifies islands of pitch marks that can be determined with high confidence. These islands are then extended into neighboring regions. A second round of island identification and extension with lower quality requirements fills the remaining gaps. We evaluate this pitch-marking method against pitch marks produced with the Praat sound analysis software [1].
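
The island-and-extension idea can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how confidence is measured or how islands are extended. The sketch assumes that candidate pitch marks (sample positions) and a per-candidate confidence score are already available. The function names, the thresholds, the minimum island lengths, and the period-consistency tolerance `tol` are all hypothetical choices made for illustration.

```python
def find_islands(conf, threshold, min_len, taken):
    """Return runs of consecutive, not-yet-accepted candidate indices
    whose confidence is at least `threshold` and whose length is at
    least `min_len` (assumed to be >= 2)."""
    islands, run = [], []
    for i, c in enumerate(conf):
        if c >= threshold and i not in taken:
            run.append(i)
        else:
            if len(run) >= min_len:
                islands.append(run)
            run = []
    if len(run) >= min_len:
        islands.append(run)
    return islands


def extend_island(island, times, taken, tol):
    """Grow an island left and right while the gap to the next candidate
    stays within a relative tolerance `tol` of the period at the island's
    edge. The consistency test is an assumption of this sketch."""
    lo, hi = island[0], island[-1]
    while hi + 1 < len(times) and (hi + 1) not in taken:
        period = times[hi] - times[hi - 1]
        gap = times[hi + 1] - times[hi]
        if abs(gap - period) > tol * period:
            break
        hi += 1
    while lo - 1 >= 0 and (lo - 1) not in taken:
        period = times[lo + 1] - times[lo]
        gap = times[lo] - times[lo - 1]
        if abs(gap - period) > tol * period:
            break
        lo -= 1
    return list(range(lo, hi + 1))


def pitch_mark(times, conf, passes=((0.8, 3), (0.5, 2)), tol=0.2):
    """Run one island-identification-and-extension round per entry in
    `passes`, each given as (confidence threshold, minimum island length).
    Later rounds use lower requirements to fill the remaining gaps."""
    accepted = set()
    for threshold, min_len in passes:
        for island in find_islands(conf, threshold, min_len, accepted):
            accepted.update(extend_island(island, times, accepted, tol))
    return sorted(accepted)


# Hypothetical candidates: sample positions and confidence scores.
times = [0, 160, 320, 480, 640, 800, 1200, 1360, 1900]
conf = [0.3, 0.9, 0.9, 0.9, 0.4, 0.2, 0.6, 0.6, 0.1]
marks = pitch_mark(times, conf)
```

In this toy input, the first round finds one high-confidence island and extends it across the low-confidence neighbors whose spacing matches the local period. The second round picks up the shorter, lower-confidence run after the gap. The final candidate is never accepted because its spacing is inconsistent with the preceding period.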

[1]  Mike Brookes, et al. The DYPSA algorithm for estimation of glottal closure instants in voiced speech, 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[2]  Yung-An Kao, et al. Pitch Marking Based on an Adaptable Filter and a Peak-Valley Estimation Method, 2001, ROCLING/IJCLCLP.

[3]  C. Wendt, et al. Pitch determination and speech segmentation using the discrete wavelet transform, 1996, 1996 IEEE International Symposium on Circuits and Systems. Circuits and Systems Connecting the World. ISCAS 96.

[4]  Yves Kamp, et al. A Frobenius norm approach to glottal closure detection from the speech signal, 1994, IEEE Trans. Speech Audio Process.

[5]  Paul C. Bagshaw, et al. Enhanced pitch tracking and the processing of F0 contours for computer aided intonation teaching, 1993, EUROSPEECH.

[6]  Gloria Faye Boudreaux-Bartels, et al. A comparison of wavelet functions for pitch detection of speech signals, 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[7]  A. Gray, et al. Least squares glottal inverse filtering from the acoustic speech waveform, 1979.

[8]  Lawrence R. Rabiner, et al. On the use of autocorrelation analysis for pitch detection, 1977.

[9]  T. Parks, et al. Maximum likelihood pitch estimation, 1976, 1977 IEEE Conference on Decision and Control including the 16th Symposium on Adaptive Processes and A Special Symposium on Fuzzy Set Theory and Applications.

[10]  H. Strube. Determination of the instant of glottal closure from the speech wave, 1974, The Journal of the Acoustical Society of America.

[11]  P. Boersma. Praat: doing phonetics by computer (version 4.4.24), 2006.

[12]  Jau-Hung Chen, et al. Pitch Marking Based on an Adaptable Filter and a Peak-Valley Estimation Method, 2001, ROCLING/IJCLCLP.

[13]  Wim Sweldens, et al. Building your own wavelets at home, 2000.

[14]  P. Boersma. Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound, 1993.

[15]  B. Atal. Automatic Speaker Recognition Based on Pitch Contours, 1969.