Automatic detection of syllabic nuclei using acoustic measures

This paper describes a method for detecting syllabic nuclei in English utterances on a frame-by-frame basis using bandpass-filtered acoustic energy measurements. Detection requires no knowledge of the utterance's phonetic composition. In the training phase, phones in English utterances read by a female speaker were assigned rank-ordered sonority values. These sonority values were then modeled by multiple linear regression, with bandpass-filtered acoustic energy values from each phone's central region as the predictor variables. Results show that (1) syllabic nuclei are identified with over 60 percent accuracy, and (2) speech rate, defined as the number of syllabic nuclei per unit time, is estimated with over 80 percent accuracy.
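The regression step and the speech-rate definition can be illustrated with a minimal sketch. It is not the authors' implementation: the number of frequency bands, the frame duration, the threshold, and the peak-picking rule (a nucleus is taken to be a local maximum of predicted sonority above a threshold) are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def fit_sonority_model(band_energies, sonority):
    """Least-squares fit of sonority on band energies plus an intercept.

    band_energies: (n_phones, n_bands) array of energies at phone centers.
    sonority: (n_phones,) array of rank-ordered sonority values.
    """
    X = np.column_stack([np.ones(len(band_energies)), band_energies])
    coef, *_ = np.linalg.lstsq(X, sonority, rcond=None)
    return coef  # coef[0] is the intercept, coef[1:] the band weights

def predict_sonority(coef, band_energies):
    """Apply the fitted model to per-frame band energies."""
    X = np.column_stack([np.ones(len(band_energies)), band_energies])
    return X @ coef

def find_nuclei(sonority_track, threshold):
    """Assumed rule: a nucleus is a local maximum above the threshold."""
    s = np.asarray(sonority_track, dtype=float)
    peaks = []
    for i in range(1, len(s) - 1):
        if s[i] > threshold and s[i] > s[i - 1] and s[i] >= s[i + 1]:
            peaks.append(i)
    return peaks

def speech_rate(nuclei, n_frames, frame_seconds):
    """Syllabic nuclei per second of speech."""
    return len(nuclei) / (n_frames * frame_seconds)
```

In use, the model would be fitted once on labeled training phones and then applied to every frame of an unlabeled utterance, so that no phonetic transcription is needed when detecting nuclei.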