A fast algorithm for computing the vocal-tract impulse response from the transfer function

This paper describes a fast algorithm that computes the impulse response of the vocal tract from its transfer function. First, numerical methods for computing the transfer function of a given vocal-tract configuration are briefly outlined. These methods include techniques (1) to decompose the numerator and denominator of the transfer function and (2) to determine the resonance modes of the vocal tract efficiently. Next, the paper describes how to calculate the residues at the poles and how to express the vocal-tract transfer function as a partial fraction expansion. Each term in the expansion corresponds to an elementary formant generator, and the sum of the terms corresponds to a parallel formant architecture. A second-order digital filter is derived for each formant generator, so the impulse response of the vocal tract can be specified compactly by a set of such filters. Good agreement is observed between the directly calculated transfer function and the one synthesized by the proposed algorithm. The algorithm is being used in the articulatory speech synthesizer under development at both Rutgers University and the Royal Institute of Technology, Sweden. An ambitious goal is to incorporate the method into a text-to-speech synthesizer and/or an adaptive voice mimic system.
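The step from a partial fraction expansion to a bank of parallel second-order filters can be illustrated with a minimal sketch. It is not the paper's derivation, which the abstract does not give. It assumes that each formant is a complex-conjugate pole pair p, p* in the s-plane with residues r, r*, and that the digital filter is obtained by impulse invariance (without the usual scaling by the sampling period), so that each section reproduces the samples of 2 Re(r exp(p t)). The function names, the 16 kHz sampling rate, and the formant frequencies, bandwidths, and residues are all hypothetical values chosen for illustration.

```python
import cmath
import math


def formant_section(pole, residue, fs):
    """Map one conjugate pole pair (s-plane, rad/s) and its residue to a
    second-order digital section, assuming impulse invariance.

    Returns ((b0, b1), (a1, a2)) for
    H(z) = (b0 + b1 z^-1) / (1 + a1 z^-1 + a2 z^-2).
    """
    zp = cmath.exp(pole / fs)                    # z-plane pole, exp(p T)
    b0 = 2.0 * residue.real
    b1 = -2.0 * (residue * zp.conjugate()).real
    a1 = -2.0 * zp.real
    a2 = abs(zp) ** 2
    return (b0, b1), (a1, a2)


def parallel_impulse_response(sections, n_samples):
    """Sum the impulse responses of the sections (parallel architecture)."""
    h = [0.0] * n_samples
    for (b0, b1), (a1, a2) in sections:
        y1 = y2 = 0.0                            # y[n-1], y[n-2]
        for n in range(n_samples):
            x0 = 1.0 if n == 0 else 0.0          # unit impulse x[n]
            x1 = 1.0 if n == 1 else 0.0          # delayed impulse x[n-1]
            y0 = b0 * x0 + b1 * x1 - a1 * y1 - a2 * y2
            h[n] += y0
            y2, y1 = y1, y0
    return h


# Hypothetical example data (not from the paper).
FS = 16000.0                                     # sampling rate in Hz
formants = [(500.0, 60.0), (1500.0, 90.0), (2500.0, 120.0)]  # (F, B) in Hz
poles = [complex(-math.pi * bw, 2.0 * math.pi * f) for f, bw in formants]
residues = [complex(200.0, -1000.0), complex(-150.0, 800.0),
            complex(100.0, -600.0)]

sections = [formant_section(p, r, FS) for p, r in zip(poles, residues)]
h = parallel_impulse_response(sections, 64)
```

Under these assumptions, three pole pairs need only twelve filter coefficients to describe the whole impulse response, which is the sense in which a set of second-order filters is a compact specification.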
