Spectral Multi-Scale Analysis for Multi-Pitch Tracking

This paper proposes a robust and accurate multi-pitch estimation method for mixtures of several voices. The method is based on spectral analysis of the multi-scale product of the mixture signal. The multi-scale product (PM) is the product of the signal's wavelet transform coefficients taken at several scales. The wavelet used is the quadratic spline function. Simulation results show that the proposed method can robustly estimate the fundamental frequencies (F0s) of clean speech and of speech mixed with various interferences.
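
The two ingredients named in the abstract, a multi-scale product and its spectrum, can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the filter taps `H` and `G` (values commonly quoted for an undecimated quadratic spline wavelet transform), the choice of three dyadic scales, the Hann window, and the function names `multiscale_product` and `product_spectrum`. The paper's procedure for selecting the pitches of several voices from the spectrum is not described in the abstract and is not reproduced.

```python
import numpy as np

# Assumed filter taps for an undecimated quadratic spline wavelet transform.
H = np.array([0.125, 0.375, 0.375, 0.125])  # low-pass smoothing filter
G = np.array([2.0, -2.0])                   # first-difference high-pass filter


def _dilate(taps, step):
    """Insert step - 1 zeros between filter taps ("a trous" dilation)."""
    out = np.zeros((len(taps) - 1) * step + 1)
    out[::step] = taps
    return out


def multiscale_product(x, n_scales=3):
    """Pointwise product of wavelet detail coefficients over dyadic scales.

    x should be longer than the longest dilated filter (13 taps for 3 scales).
    """
    approx = np.asarray(x, dtype=float)
    product = np.ones_like(approx)
    for j in range(n_scales):
        step = 2 ** j
        # Detail (wavelet) coefficients at scale 2**j, same length as x.
        detail = np.convolve(approx, _dilate(G, step), mode="same")
        product *= detail
        # Smooth the approximation before moving to the next, coarser scale.
        approx = np.convolve(approx, _dilate(H, step), mode="same")
    return product


def product_spectrum(frame, fs):
    """Magnitude spectrum of one Hann-windowed frame of the product."""
    windowed = np.asarray(frame, dtype=float) * np.hanning(len(frame))
    magnitude = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    return freqs, magnitude
```

Applied to a unit step, this sketch yields a product that is nonzero only at the step location, which is the sharpening effect that makes the product useful for locating abrupt events such as glottal closures. In a pitch tracker, `product_spectrum` would be called on successive frames of the product and the peaks of each magnitude spectrum examined as F0 candidates.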
