Voice source parameter measurement based on multi-scale analysis of electroglottographic signal

This paper addresses the measurement of glottal parameters from the electroglottographic (EGG) signal. The proposed approach relies on glottal closure instants (GCIs) and glottal opening instants (GOIs) determined by a multi-scale analysis of the EGG signal. The wavelet transform of the EGG signal is computed with a quadratic spline wavelet. The wavelet coefficients, calculated at different dyadic scales, show modulus maxima at localized discontinuities of the EGG signal. The detected maxima and minima correspond to the GOIs and GCIs, respectively. To improve the precision of GCI and GOI localization, the product of the wavelet transform coefficients at three successive dyadic scales, called the multi-scale product (MP), is computed. This product enhances edges and reduces noise and spurious peaks. Taking the cube root of the multi-scale product amplitude improves the detection of weak GOI maxima and avoids missed GCIs. Applied to the Keele University database, the method detects GCIs and GOIs well. Voicing classification, pitch frequency, and open quotient are then measured from the detected GCIs and GOIs. The proposed voicing classification approach is evaluated in the presence of additive noise: for clean signals its performance is 96.4%, and at an SNR of 5 dB it is 93%. For the fundamental frequency and open quotient measurements, a comparison of the MP with the DEGG, Howard (3/7), threshold (35% and 50%), and DECOM methods shows that the proposed approach performs comparably to the main existing methods while exhibiting the lowest deviation.
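The processing chain described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes an undecimated ("à trous") dyadic transform built from a [1, 3, 3, 1]/8 smoothing filter and a first-difference filter, which only approximates a quadratic spline wavelet and omits any per-scale normalization. The function names (`dyadic_wavelet`, `multiscale_product`, `f0_and_open_quotient`) and the definition of the open quotient as the span from a GOI to the next GCI divided by the GCI-to-GCI period are assumptions made for illustration.

```python
import numpy as np


def _upsample(kernel, level):
    """Insert 2**level - 1 zeros between filter taps (the 'a trous' trick)."""
    step = 2 ** level
    out = np.zeros((len(kernel) - 1) * step + 1)
    out[::step] = kernel
    return out


def dyadic_wavelet(signal, n_scales=3):
    """Undecimated wavelet coefficients at n_scales dyadic scales.

    Assumed filters: h smooths, g differentiates. The result behaves like
    the derivative of the signal smoothed at scales 1, 2, 4, ...
    """
    h = np.array([1.0, 3.0, 3.0, 1.0]) / 8.0  # assumed spline smoothing filter
    g = np.array([1.0, -1.0])                 # assumed difference filter
    approx = np.asarray(signal, dtype=float)
    details = []
    for level in range(n_scales):
        details.append(np.convolve(approx, _upsample(g, level), mode="same"))
        approx = np.convolve(approx, _upsample(h, level), mode="same")
    return np.array(details)


def multiscale_product(signal, n_scales=3):
    """Cube root of the product of coefficients over successive scales.

    With an odd number of scales the sign of an edge is preserved, so
    falling and rising edges give peaks of opposite polarity.
    """
    coeffs = dyadic_wavelet(signal, n_scales)
    return np.cbrt(np.prod(coeffs, axis=0))


def f0_and_open_quotient(gci, goi, fs):
    """Per-cycle F0 (Hz) and open quotient from GCI/GOI sample indices.

    Assumed definitions: the period runs from one GCI to the next, and the
    open phase runs from the first GOI inside that period to the next GCI.
    Cycles without a GOI are skipped.
    """
    gci = np.asarray(gci)
    goi = np.asarray(goi)
    f0, oq = [], []
    for start, end in zip(gci[:-1], gci[1:]):
        inside = goi[(goi > start) & (goi < end)]
        if inside.size == 0:
            continue
        period = end - start
        f0.append(fs / period)
        oq.append((end - inside[0]) / period)
    return np.array(f0), np.array(oq)
```

On a toy signal with one sharp fall and one sharp rise, `multiscale_product` yields a negative peak near the fall and a positive peak near the rise. Which polarity marks closure and which marks opening depends on the polarity convention of the EGG recording. Picking the peaks (for example by thresholding) and passing their indices to `f0_and_open_quotient` gives per-cycle estimates of the fundamental frequency and open quotient.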