A New Method for Pitch Tracking and Voicing Decision Based on Spectral Multi-scale Analysis

This paper proposes a new method for voicing detection and pitch estimation based on spectral analysis of the multi-scale product (MP) of the speech signal. The MP is formed by multiplying the wavelet transform coefficients of the speech signal computed at several scales, using the quadratic spline function as the wavelet. The spectrum of the MP exhibits spectral lines at the fundamental frequency and its harmonics. We evaluate the approach on the Keele University database. The experimental results show that the method is effective compared with state-of-the-art algorithms.
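
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, and several choices are assumptions made only for illustration: a Gaussian-derivative kernel stands in for the quadratic spline wavelet, the scales (1, 2, 4), the 60-400 Hz search band, and the rule "take the lowest spectral peak that reaches half of the strongest peak" are not taken from the paper, and the function names `derivative_wavelet_transform`, `multiscale_product`, and `estimate_f0` are hypothetical.

```python
import numpy as np

def derivative_wavelet_transform(x, scale):
    """Smooth x with a Gaussian of width `scale` samples and differentiate.

    Assumption: this Gaussian-derivative kernel is a stand-in for the
    quadratic spline wavelet named in the abstract.
    """
    half = int(4 * scale)
    t = np.arange(-half, half + 1)
    g = np.exp(-0.5 * (t / scale) ** 2)
    g /= g.sum()
    dg = np.gradient(g)  # discrete derivative of the smoothing kernel
    return np.convolve(x, dg, mode="same")

def multiscale_product(x, scales=(1, 2, 4)):
    """Multiply the wavelet coefficients of x sample by sample across scales."""
    product = np.ones_like(x, dtype=float)
    for s in scales:
        product *= derivative_wavelet_transform(x, s)
    return product

def estimate_f0(frame, fs, fmin=60.0, fmax=400.0, rel_threshold=0.5):
    """Estimate f0 from the magnitude spectrum of the multi-scale product.

    Assumption: the peak-picking rule below (lowest peak reaching
    rel_threshold times the strongest peak in the band) is illustrative.
    """
    mp = multiscale_product(frame)
    mp = mp - mp.mean()                    # suppress the DC line
    windowed = mp * np.hanning(len(mp))    # also tapers frame-edge artifacts
    nfft = 8 * len(mp)                     # zero-padding for a finer grid
    spectrum = np.abs(np.fft.rfft(windowed, nfft))
    freqs = np.fft.rfftfreq(nfft, 1.0 / fs)

    band_idx = np.flatnonzero((freqs >= fmin) & (freqs <= fmax))
    mag = spectrum[band_idx]
    k = np.flatnonzero(mag >= rel_threshold * mag.max())[0]
    while k + 1 < len(mag) and mag[k + 1] > mag[k]:
        k += 1                             # climb to the local maximum
    return freqs[band_idx[k]]

# Synthetic test input: a 120 Hz sawtooth, whose sharp edge in every period
# loosely mimics the abrupt glottal events that the product emphasizes.
fs = 16000
true_f0 = 120.0
n = np.arange(int(0.064 * fs))
frame = 2.0 * ((true_f0 * n / fs) % 1.0) - 1.0
f0_hat = estimate_f0(frame, fs)
print(f"estimated f0: {f0_hat:.1f} Hz")
```

Multiplying across scales reinforces sharp events that persist at every scale and attenuates everything else, so the product of a periodic signal resembles a pulse train whose spectrum is a comb at the fundamental frequency and its harmonics. On real speech, the scales, frame length, and peak-picking rule would need tuning, and a separate voicing decision would be required before the estimate is trusted.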
