Speech analysis and synthesis using a glottal excited AR model with DTW-based glottal determination

In this paper we present a new method for determining the excitation waveform of a glottal excited speech synthesizer. In the glottal excited speech model, voiced speech is decomposed into a parametric, glottal-shaped excitation signal and an AR vocal tract filter. The new glottal determination method can be applied iteratively, alternating with vocal tract filter updates during model analysis. In each iteration, the new glottal pulse positions are determined at sub-sample resolution by a Multi-Dimensional Dynamic Time Warping (MD-DTW) algorithm, which uses an analysis-by-synthesis approach to minimize the squared error between the original and the synthetic speech.
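As a rough illustration of the analysis-by-synthesis idea, the sketch below synthesizes a signal by passing a parametric glottal pulse through an all-pole (AR) filter, then picks the pulse position that minimizes the squared error against a target. It is not the method of the paper: an exhaustive search over integer positions stands in for the MD-DTW step, sub-sample resolution is omitted, and the raised-cosine pulse shape, the filter coefficients, and all function names are assumptions chosen for illustration.

```python
import math

def rosenberg_pulse(n_open, n_close):
    """Trigonometric glottal pulse (assumed shape): cosine rise, then cosine fall."""
    opening = [0.5 * (1.0 - math.cos(math.pi * n / n_open)) for n in range(n_open)]
    closing = [math.cos(0.5 * math.pi * n / n_close) for n in range(n_close)]
    return opening + closing

def excitation(length, positions, pulse):
    """Build the excitation by placing one pulse at each integer start position."""
    e = [0.0] * length
    for p in positions:
        for k, v in enumerate(pulse):
            if 0 <= p + k < length:
                e[p + k] += v
    return e

def ar_filter(e, a):
    """All-pole vocal tract filter: s[n] = e[n] - sum_k a[k] * s[n-1-k]."""
    s = []
    for n, x in enumerate(e):
        acc = x
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc -= ak * s[n - 1 - k]
        s.append(acc)
    return s

def squared_error(x, y):
    """Sum of squared differences between two equal-length signals."""
    return sum((u - v) ** 2 for u, v in zip(x, y))

def best_position(target, a, pulse, candidates):
    """Exhaustive analysis-by-synthesis search (a stand-in for MD-DTW)."""
    def cost(p):
        synthetic = ar_filter(excitation(len(target), [p], pulse), a)
        return squared_error(target, synthetic)
    return min(candidates, key=cost)

# Hypothetical example: one pulse at sample 23, stable second-order AR filter.
pulse = rosenberg_pulse(12, 6)
a = [-1.3, 0.8]
target = ar_filter(excitation(80, [23], pulse), a)
print(best_position(target, a, pulse, range(40)))
```

Because the target here is generated by the same model, the search recovers the true position with zero error. With real speech, the filter coefficients and pulse parameters would also have to be re-estimated between searches, as the abstract describes.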
