Location and classification of plosive consonants using expert knowledge and neural net classifiers

A rule‐based segmentation and broad classification algorithm [R. A. Cole and L. Hou, Proc. ICASSP 88, 453–456 (1988)] located over 95% of segments labeled [b], [d], [g], [p], [t], [k], [ch], [jh], [dr], and [q] (glottal stop) before sonorants in utterances of the DARPA TIMIT database. Artificial neural net (ANN) classifiers were trained to discriminate among the labels using perceptually motivated features. In one condition, 37 feature measurements were used to describe (a) the averaged spectrum during the 15 ms following the release burst, (b) zero crossings and peak‐to‐peak amplitude contours in the region of the segment, (c) the duration of the segment, and (d) the amplitude of the plosive burst. In a second condition, an additional 16 features were used to characterize the averaged spectrum during the first 30 ms of the sonorant following the plosive. The ANN classifiers consisted of either 37 or 53 input units, 30 hidden units, and 1 output unit for each category. Classifiers were trained using backp...
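The network shape described above (37 or 53 input units, 30 hidden units, one output unit per category) can be illustrated with a minimal sketch. Everything beyond those unit counts is an assumption made for illustration: a single fully connected hidden layer, sigmoid activations, ten output units (one for each of the ten labels listed), small random untrained weights, and the hypothetical helper names `init_network`, `forward`, and `classify`. Training is not shown.

```python
import math
import random

# The ten segment labels listed in the abstract; one output unit per label.
LABELS = ["b", "d", "g", "p", "t", "k", "ch", "jh", "dr", "q"]


def init_network(n_inputs, n_hidden=30, n_outputs=len(LABELS), seed=0):
    """Build an untrained n_inputs -> n_hidden -> n_outputs network.

    Weights are small random values (an assumption; the abstract does not
    say how the weights were initialized).
    """
    rng = random.Random(seed)

    def layer(n_in, n_out):
        weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        return weights, biases

    return [layer(n_inputs, n_hidden), layer(n_hidden, n_outputs)]


def sigmoid(x):
    # Assumed activation function; not stated in the abstract.
    return 1.0 / (1.0 + math.exp(-x))


def forward(network, features):
    """Return one activation per output unit for a single feature vector."""
    expected = len(network[0][0][0])
    if len(features) != expected:
        raise ValueError(f"expected {expected} features, got {len(features)}")
    activations = list(features)
    for weights, biases in network:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


def classify(network, features):
    """Pick the label whose output unit has the highest activation."""
    scores = forward(network, features)
    return LABELS[scores.index(max(scores))]


if __name__ == "__main__":
    # First condition: 37 features. Second condition: 37 + 16 = 53 features.
    for n_features in (37, 53):
        net = init_network(n_features)
        dummy = [0.5] * n_features  # placeholder values, not real measurements
        print(n_features, classify(net, dummy))
```

Choosing the label with the largest output activation is one common decision rule for a network with one output unit per category; the abstract does not state which rule was used.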