Automatic glottal closed-phase location and analysis by Kalman filtering

As a step towards improving data-driven speaker characterisation for speech synthesis, this paper describes a method for automatically locating the closed phase (CP) of the glottal cycle, followed by linear predictive (LP) analysis of the CP speech data. Our approach to detecting the CP is designed to exclude intervals that lie outside the CP, rather than to locate the instants of glottal closure and opening accurately. The indicator used is the log determinant of the Kalman filter (KF) estimate error covariance matrix. The CP LP analysis applies a Kalman filter to the CP data only, treating the open-phase data as “missing” and exploiting the dependence between neighbouring CP spectra. In both techniques the Kalman filtering process is refined to accommodate smoothing, re-estimation of the Kalman parameters, handling of missing data, and robust estimation.
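To make the two ingredients named above concrete, the following is a minimal sketch of a Kalman filter that tracks LP (autoregressive) coefficients, skips the measurement update for samples flagged as missing, and records the log determinant of the estimate error covariance at every step. It is an illustration under stated assumptions, not the authors' implementation: the random-walk state model, the function names (`simulate_ar2`, `cholesky_logdet`, `kalman_ar_track`), the noise variances `q` and `r`, and the boolean `missing` mask are all hypothetical, and the smoothing, parameter re-estimation and robustification mentioned in the abstract are omitted.

```python
import math
import random


def simulate_ar2(n, phi=(1.5, -0.8), sigma=0.1, seed=0):
    """Synthetic AR(2) signal used only to exercise the sketch (assumed values)."""
    rng = random.Random(seed)
    y = [0.0, 0.0]
    for _ in range(n - 2):
        y.append(phi[0] * y[-1] + phi[1] * y[-2] + rng.gauss(0.0, sigma))
    return y


def cholesky_logdet(P):
    """Log determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(P)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = P[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return 2.0 * sum(math.log(L[i][i]) for i in range(n))


def kalman_ar_track(y, order=2, q=1e-6, r=1e-2, missing=None):
    """Track AR coefficients with an assumed random-walk state model.

    State:       a_t = a_{t-1} + w_t,        cov(w_t) = q * I
    Observation: y_t = h_t^T a_t + v_t,      var(v_t) = r,
                 h_t = [y_{t-1}, ..., y_{t-order}]
    Samples with missing[t] == True receive the time update only.
    Returns the final coefficient estimate and log det(P) for each step.
    """
    p = order
    a = [0.0] * p
    P = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    logdets = []
    for t in range(p, len(y)):
        # Time update: the random walk inflates the error covariance.
        for i in range(p):
            P[i][i] += q
        if not (missing is not None and missing[t]):
            # Measurement update with the scalar observation y[t].
            h = [y[t - 1 - k] for k in range(p)]
            Ph = [sum(P[i][j] * h[j] for j in range(p)) for i in range(p)]
            s = sum(h[i] * Ph[i] for i in range(p)) + r  # innovation variance
            K = [Ph[i] / s for i in range(p)]            # Kalman gain
            e = y[t] - sum(h[i] * a[i] for i in range(p))
            a = [a[i] + K[i] * e for i in range(p)]
            P = [[P[i][j] - K[i] * Ph[j] for j in range(p)] for i in range(p)]
        logdets.append(cholesky_logdet(P))
    return a, logdets
```

In this sketch the log determinant rises whenever an update is skipped, because only the process noise is added to the covariance, and it cannot rise at a measurement update. How the paper turns the log determinant into a CP decision is not specified in the abstract and is not reproduced here.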
