A time-warping framework for speech turbulence-noise component estimation during aperiodic phonation

Accurate estimation of the turbulence-noise component of speech matters in many areas of speech processing, including separate modification of the noise component, analysis of the degree of aspiration when treating pathological voice, automatic labeling of voicing, and speaker characterization and recognition. Previous work has provided methods that estimate a high-quality noise component in near-periodic speech, but these methods are known to leak aperiodic phonation into the noise-component estimate, even when the deviations from periodicity are slight. In this paper, we improve upon existing algorithms under conditions of aperiodicity by introducing a time-warping-based approach to speech noise-component estimation, and we demonstrate the results on both natural and synthetic speech examples.
