Formants in automatic speech recognition

This paper concerns the use of formant frequency information in automatic speech recognition. The discussion addresses the physical significance of the formant and how it relates to the phonetic concepts of segment and equivalence, which are needed for the recognition of phonetic types. Specifically, the definition of the phone in terms of articulatory dynamics can be interpreted acoustically in terms of formant dynamics. Hence formant transition information can aid segmentation. Also, formant frequencies for given utterances by single speakers display remarkable inter-repetition stability, while speaker identity, phonetic type, and the phonetic, prosodic, and linguistic contexts are sources of nonrandom variability that should be included in a complete acoustic-phonetic description of formant behavior.
