Prosodic Aids to Speech Recognition. II. Syntactic Segmentation and Stressed Syllable Location

Abstract: A strategy is outlined for the acoustic aspects of speech recognition in which prosodic features are used to detect boundaries between phrases, stressed syllables are then located within each constituent, and a partial distinctive-feature analysis is performed within the stressed syllables. Facilities have been implemented for linear prediction, formant tracking, and extraction of fundamental frequency and speech energy contours. Experiments were conducted on the automatic detection of constituent boundaries and the location of stressed syllables by analysis of fundamental frequency and energy contours, using recordings of six talkers reading the Rainbow Script, two talkers reading a paragraph composed of monosyllabic words, and ten talkers speaking sentences pertinent to man-computer interaction. A program was implemented that detects over 80% of all boundaries between major syntactic constituents by locating fall-rise valleys in fundamental frequency contours. A panel of three listeners judged which syllables in the speech texts were stressed, unstressed, or reduced. Judgments from two of the listeners were quite consistent from one occasion to the next, and these two listeners agreed with each other particularly well on which syllables were stressed. The third listener gave less consistent results.
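
The idea of marking constituent boundaries at fall-rise valleys in a fundamental frequency contour can be illustrated with a minimal sketch. This is not the program described above: the function name `find_fall_rise_valleys`, the 7% fall and rise thresholds, and the assumption of a smoothed, gap-free contour sampled once per frame are all hypothetical choices made only for illustration.

```python
def find_fall_rise_valleys(f0, min_fall=0.07, min_rise=0.07):
    """Return frame indices of fall-rise valleys in an F0 contour.

    f0 is a list of fundamental frequency values in Hz, one per frame.
    Assumption: the contour is already smoothed and has no unvoiced gaps.
    min_fall and min_rise are illustrative relative thresholds, not
    values taken from the report.
    """
    valleys = []
    if not f0:
        return valleys

    peak = f0[0]        # highest value since the last reported valley
    trough = f0[0]      # lowest value seen since that peak
    trough_idx = 0

    for i, value in enumerate(f0):
        # Track the deepest point reached after the current peak.
        if value < trough:
            trough, trough_idx = value, i

        fell_enough = (peak - trough) >= min_fall * peak
        rose_enough = (value - trough) >= min_rise * trough

        if fell_enough and rose_enough:
            # A sufficient fall followed by a sufficient rise:
            # report the bottom of the valley as a candidate boundary.
            valleys.append(trough_idx)
            peak, trough, trough_idx = value, value, i
        elif value > peak:
            # New high before any qualifying valley: restart from here.
            peak, trough, trough_idx = value, value, i

    return valleys


# Hypothetical contour (Hz) with two clear valleys and one shallow dip.
contour = [120, 125, 130, 124, 115, 108, 104, 110, 118, 126,
           122, 121, 123, 119, 112, 100, 96, 103, 111]
print(find_fall_rise_valleys(contour))
```

For the hypothetical contour shown, the sketch reports frames 6 and 16 and ignores the shallow dip around frame 11, because that dip falls less than 7% below the preceding peak. A practical detector would also need to handle unvoiced stretches and convert frame indices to times.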