Formant estimation of speech signals using subspace-based spectral analysis

This paper proposes a signal processing scheme that uses subspace-based spectral analysis to estimate the formants of speech signals. Specifically, the scheme is based on a decimative spectral estimation method that relies on eigenanalysis and singular value decomposition (SVD). The underlying model assumes that the processed signal can be decomposed into complex damped sinusoids. For formant tracking, the algorithm is applied to a small number of the autocorrelation coefficients of each speech frame. The proposed scheme is evaluated on both artificial speech and real speech utterances from the TIMIT database. For the artificial speech, comparisons with standard methods are provided, and they indicate that the proposed methodology successfully estimates formant trajectories.
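
To make the damped-sinusoid model concrete, the following Python sketch estimates formant frequencies and bandwidths from the first few autocorrelation lags of a frame. It is only an illustration under stated assumptions: it uses a plain Hankel-SVD estimator with a shift-invariance step and omits decimation entirely, so it is not the decimative method of the paper. The function names, the Hamming window, and the defaults `n_lags=24` and `order=8` are hypothetical choices made for this example.

```python
import numpy as np

def damped_sinusoid_poles(x, order):
    """Estimate `order` poles of a sum of damped sinusoids (Hankel SVD, no decimation)."""
    x = np.asarray(x, dtype=float)
    rows = len(x) // 2
    cols = len(x) - rows + 1
    # Hankel data matrix: H[i, j] = x[i + j]
    H = np.array([x[i:i + cols] for i in range(rows)])
    U, _, _ = np.linalg.svd(H, full_matrices=False)
    Us = U[:, :order]  # basis of the signal subspace
    # Shift invariance: Us[1:] ~= Us[:-1] @ Z, and the eigenvalues of Z are the poles
    Z, *_ = np.linalg.lstsq(Us[:-1], Us[1:], rcond=None)
    return np.linalg.eigvals(Z)

def poles_to_formants(poles, fs):
    """Convert poles to (frequency, bandwidth) in Hz, keeping positive frequencies."""
    poles = poles[poles.imag > 0]
    freqs = np.angle(poles) * fs / (2 * np.pi)
    bandwidths = -np.log(np.abs(poles)) * fs / np.pi
    idx = np.argsort(freqs)
    return freqs[idx], bandwidths[idx]

def frame_autocorrelation(frame, n_lags):
    """Autocorrelation of a Hamming-windowed frame for lags 0 .. n_lags - 1."""
    frame = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    full = np.correlate(frame, frame, mode="full")
    zero_lag = len(frame) - 1
    return full[zero_lag:zero_lag + n_lags]

def formants_from_frame(frame, fs, n_lags=24, order=8):
    """Formant candidates of one frame, estimated from its autocorrelation lags."""
    r = frame_autocorrelation(frame, n_lags)
    return poles_to_formants(damped_sinusoid_poles(r, order), fs)
```

Applying `formants_from_frame` to successive frames would yield raw per-frame candidates; turning them into smooth trajectories requires a separate tracking step that this sketch does not attempt.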
