Voice Processing by Dynamic Glottal Models with Applications to Speech Enhancement

We discuss the use of low-dimensional physical models of the voice source for speech coding and processing applications. We illustrate a class of waveform-adaptive dynamic glottal models and the associated parameter tracking procedures. The model and analysis procedures are assessed on speech encoding and enhancement, which are carried out by using a state-space version of the dynamical model in an extended Kalman filtering framework. The proposed method is shown to provide better SNR improvement than a standard AR Kalman filtering scheme.
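To make the filtering framework and the evaluation metric concrete, the following is a minimal sketch of one generic extended Kalman filter step, together with an SNR-improvement measure. It is not the glottal model or the tracking procedure of this paper: the names `ekf_step`, `snr_db`, and `snr_improvement_db`, and the callables `f`, `h`, `F_jac`, and `H_jac` (state transition, observation, and their Jacobians), are hypothetical placeholders for whatever state-space model is used.

```python
import numpy as np

def ekf_step(x, P, y, f, h, F_jac, H_jac, Q, R):
    """One predict/update cycle of a generic extended Kalman filter.

    x, P   : previous state estimate (n,) and covariance (n, n)
    y      : current noisy observation (m,)
    f, h   : state-transition and observation functions (assumed, user-supplied)
    F_jac  : Jacobian of f, evaluated at the previous estimate
    H_jac  : Jacobian of h, evaluated at the predicted state
    Q, R   : process and observation noise covariances
    """
    # Predict: propagate the state through the nonlinear model
    F = F_jac(x)
    x_pred = f(x)
    P_pred = F @ P @ F.T + Q

    # Update: correct the prediction with the new observation
    H = H_jac(x_pred)
    innovation = y - h(x_pred)
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ innovation
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

def snr_db(clean, estimate):
    """SNR in dB of `estimate` with respect to the reference `clean` signal."""
    clean = np.asarray(clean, dtype=float)
    error = clean - np.asarray(estimate, dtype=float)
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(error ** 2))

def snr_improvement_db(clean, noisy, enhanced):
    """SNR improvement: output SNR minus input SNR, in dB."""
    return snr_db(clean, enhanced) - snr_db(clean, noisy)
```

With linear `f` and `h` the step reduces to the standard Kalman filter, which is the setting of an AR baseline; a physical source model would replace `f` with its own nonlinear dynamics.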
