Extracting formants from short segments of speech using group delay functions

Speech is a non-stationary signal: the shape of the vocal tract changes over several pitch periods, and also within the open and closed phases of the glottis. These changes are reflected in the locations of the formants, which correspond to the resonant frequencies of the vocal tract. To observe them, the analysis window should be short relative to a pitch period and appropriately anchored. This paper proposes a non-model-based method for accurately determining formants from speech segments shorter than a pitch period. The method exploits the high-resolution properties of the group delay function, and its main advantage is that it does not depend on the parameters of a model. Analysis segments are synchronised with the instants of glottal closure to increase the robustness of formant extraction. The method uses neither continuity constraints nor additional acoustic-phonetic knowledge, yet remains fairly reliable and robust.
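As a rough illustration of the idea, and not the authors' implementation, the group delay function of a short segment can be computed without phase unwrapping from two DFTs, using the standard identity tau(w) = (X_R Y_R + X_I Y_I) / |X|^2, where X is the DFT of x[n] and Y is the DFT of n x[n]. The sketch below applies it to a synthetic 5 ms segment: the truncated impulse response of a single resonance, starting at the excitation instant. The names `group_delay` and `resonator_response`, the 8 kHz sampling rate, and the 1000 Hz / 200 Hz resonance are all assumptions chosen for the example.

```python
import numpy as np

def group_delay(segment, nfft=1024):
    """Group delay (in samples) of a short segment, via tau = Re(Y / X)."""
    x = np.asarray(segment, dtype=float)
    n = np.arange(len(x))            # time index, 0 at the segment start
    X = np.fft.rfft(x, nfft)         # DFT of x[n]
    Y = np.fft.rfft(n * x, nfft)     # DFT of n * x[n]
    power = np.abs(X) ** 2
    # Floor the denominator to avoid division by zero at spectral nulls.
    return (X.real * Y.real + X.imag * Y.imag) / np.maximum(power, 1e-12)

def resonator_response(freq_hz, bandwidth_hz, fs, length):
    """Truncated impulse response of a two-pole resonator (hypothetical test signal)."""
    r = np.exp(-np.pi * bandwidth_hz / fs)   # pole radius from bandwidth
    theta = 2.0 * np.pi * freq_hz / fs       # pole angle from centre frequency
    n = np.arange(length)
    return r ** n * np.sin((n + 1) * theta) / np.sin(theta)

fs = 8000                                    # assumed sampling rate in Hz
nfft = 1024
segment = resonator_response(1000.0, 200.0, fs, length=40)   # 5 ms segment
tau = group_delay(segment, nfft)
freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
peak_hz = freqs[np.argmax(tau)]              # frequency of the largest group delay peak
print(f"largest group delay peak near {peak_hz:.0f} Hz")
```

For this synthetic segment the largest peak falls close to the 1000 Hz resonance even though only 40 samples are analysed. Real speech would additionally need the glottal closure instants to anchor each segment, which this sketch does not attempt.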
