Spoken word recognition based on top-down phoneme segmentation

This paper describes a new spoken word recognition approach based on top-down phoneme segmentation. Fourteen phoneme recognition functions are introduced to handle various coarticulations. The approach has two advantages. First, precise phoneme recognition can be achieved through phoneme-level top-down verification. Second, the vocabulary knowledge source requires only phoneme symbol sequences, because coarticulation knowledge is contained in the phoneme-level knowledge sources. Experimental recognition results for 100 city names uttered by 50 speakers indicate that phoneme concatenations showing strong coarticulation must be segmented as a unit to achieve a high recognition rate.
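
The general idea of top-down segmentation driven by a phoneme-sequence lexicon can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not specify the fourteen recognition functions or the search procedure, so everything below is an assumption made for illustration. The names `make_scorer`, `segment`, and `recognize` are hypothetical, the frames are one-dimensional toy features, each phoneme verifier is a simple squared-distance score, and the segmentation is found by a plain dynamic program. A strongly coarticulated concatenation could be modelled here as a single lexicon symbol with its own scorer, so that it is segmented as a unit.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Scorer = Callable[[Sequence[float]], float]
NEG_INF = float("-inf")


def make_scorer(target: float) -> Scorer:
    """Hypothetical phoneme verifier: scores are higher (closer to 0)
    when every frame in the segment is close to `target`."""
    def score(frames: Sequence[float]) -> float:
        return -sum((x - target) ** 2 for x in frames)
    return score


def segment(
    frames: Sequence[float],
    phonemes: Sequence[str],
    scorers: Dict[str, Scorer],
) -> Tuple[float, List[Tuple[int, int]]]:
    """Split `frames` into len(phonemes) contiguous, non-empty segments,
    one per expected phoneme, maximizing the summed verifier score.
    Returns (total score, list of half-open (start, end) boundaries)."""
    n, m = len(frames), len(phonemes)
    # best[j][t]: best score for the first j phonemes covering frames[:t]
    best = [[NEG_INF] * (n + 1) for _ in range(m + 1)]
    back = [[0] * (n + 1) for _ in range(m + 1)]
    best[0][0] = 0.0
    for j in range(1, m + 1):
        scorer = scorers[phonemes[j - 1]]
        for t in range(j, n + 1):
            for s in range(j - 1, t):
                if best[j - 1][s] == NEG_INF:
                    continue
                cand = best[j - 1][s] + scorer(frames[s:t])
                if cand > best[j][t]:
                    best[j][t] = cand
                    back[j][t] = s
    if best[m][n] == NEG_INF:
        return NEG_INF, []  # too few frames for this phoneme sequence
    # Trace the segment boundaries back from the final frame.
    bounds: List[Tuple[int, int]] = []
    t = n
    for j in range(m, 0, -1):
        s = back[j][t]
        bounds.append((s, t))
        t = s
    bounds.reverse()
    return best[m][n], bounds


def recognize(
    frames: Sequence[float],
    lexicon: Dict[str, List[str]],
    scorers: Dict[str, Scorer],
) -> str:
    """Pick the vocabulary word whose phoneme sequence explains the
    frames best under top-down segmentation."""
    return max(lexicon, key=lambda w: segment(frames, lexicon[w], scorers)[0])


if __name__ == "__main__":
    # Toy data: three phonemes with made-up target feature values.
    scorers = {"a": make_scorer(1.0), "k": make_scorer(5.0), "i": make_scorer(9.0)}
    lexicon = {"aki": ["a", "k", "i"], "ika": ["i", "k", "a"]}
    frames = [1.0, 1.1, 5.0, 4.9, 5.1, 9.0]
    print(recognize(frames, lexicon, scorers))
    print(segment(frames, lexicon["aki"], scorers)[1])
```

In this sketch the lexicon holds nothing but phoneme symbol sequences, and all acoustic knowledge lives in the per-phoneme scorers, which mirrors the division of knowledge sources described in the abstract.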