Wavelets and Granular Analysis of Speech

The speech signal results from the convolution of a source signal, produced either by the vibration of the vocal cords or by airflow through a constriction of the vocal tract, with the impulse response of the vocal tract. Both components change rapidly over time, and the phonetic information in the signal is usually considered to be carried mainly by the evolution of the first two or three resonances of the vocal tract, called “formants” (F1, F2, F3). The vibration frequency of the vocal cords, F0, is closely related to a perceptual quality of sounds called “pitch”.
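The source-filter description above can be illustrated with a minimal sketch. It is not the analysis method of this work, only a toy synthesis under stated assumptions: the voiced source is idealized as an impulse train at F0, and the vocal tract impulse response is modeled as a sum of exponentially damped sinusoids, one per formant. The function name `synthesize_vowel`, the formant frequencies, the bandwidths, and the sampling rate are all hypothetical values chosen for illustration.

```python
import numpy as np

def synthesize_vowel(f0=120.0, formants=(700.0, 1200.0, 2600.0),
                     bandwidths=(80.0, 90.0, 120.0), fs=16000, duration=0.5):
    """Toy source-filter synthesis (all default values are illustrative assumptions)."""
    n = int(fs * duration)

    # Source: idealized glottal excitation, one unit impulse per pitch period.
    source = np.zeros(n)
    period = int(round(fs / f0))
    source[::period] = 1.0

    # Filter: vocal tract impulse response as a sum of damped sinusoids,
    # one per formant; the decay rate is set by the formant bandwidth.
    t = np.arange(int(0.03 * fs)) / fs  # 30 ms is enough for the decay
    h = np.zeros_like(t)
    for f, bw in zip(formants, bandwidths):
        h += np.exp(-np.pi * bw * t) * np.sin(2 * np.pi * f * t)

    # Speech: convolution of the source with the impulse response.
    speech = np.convolve(source, h)[:n]
    return source, h, speech

source, h, speech = synthesize_vowel()
```

In this sketch, each pitch period of `speech` is a copy of the impulse response `h` overlapped with the tails of the previous periods. That makes the two roles easy to see: F0 sets the spacing of the copies, and the formants set their shape.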
