A coarse phonetic knowledge source for template independent large vocabulary word recognition

In this paper we present a template-independent knowledge source (KS) that uses coarse phonetic information to substantially constrain the candidate vocabulary for word hypothesization with very large vocabularies. It consists of three parts: a segmenter that breaks a test utterance into a sequence of coarse phonetic classes; a knowledge compiler that generates a reference dictionary containing the appropriate coarse phonetic representation for each word candidate; and a matching engine. Coarse phonetic classification is performed using linear discriminant analysis, more specifically perceptron classification. The knowledge compiler first generates a phonemic representation and segmental durations by rule from a list of word candidates (i.e., from text), and then derives coarse phonetic class segments. Matching is performed by a nonlinear time alignment algorithm based on dissimilarity scores between detected and lexical coarse class segments. The coarse phonetic KS was tested by compiling a word list of approximately 1500 words. Using only the coarse classes Silence, Plosive, Fricative, Vocalic, Front Vowel, Back Vowel, Nasal, and R, the vocabulary is reduced to 5% of its original size at an error rate below 5% for three different speakers.

Alex Waibel | H. Lagger
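
The compile-and-match pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the phoneme-to-class table, the local dissimilarity values, the insertion/deletion cost of 1, and all function names are assumptions made for illustration. A weighted edit-distance alignment stands in for the paper's nonlinear time alignment, and segmental durations are ignored.

```python
# Hypothetical phoneme-to-coarse-class table (an assumption, not from the paper).
# Class labels abbreviate the classes named in the abstract.
PHONEME_TO_CLASS = {
    "p": "PLO", "t": "PLO", "k": "PLO", "b": "PLO", "d": "PLO", "g": "PLO",
    "s": "FRI", "f": "FRI", "z": "FRI", "sh": "FRI",
    "iy": "FV", "ih": "FV", "eh": "FV", "ae": "FV",
    "aa": "BV", "ao": "BV", "uw": "BV", "ow": "BV",
    "m": "NAS", "n": "NAS",
    "r": "R",
    "l": "VOC", "w": "VOC",
}


def to_coarse(phonemes):
    """Map phonemes to coarse classes, merging adjacent identical classes."""
    classes = []
    for phoneme in phonemes:
        label = PHONEME_TO_CLASS[phoneme]
        if not classes or classes[-1] != label:
            classes.append(label)
    return classes


def dissimilarity(a, b):
    """Assumed local cost: 0 for a match, 0.5 between vowel-like classes, else 1."""
    if a == b:
        return 0.0
    vowel_like = {"VOC", "FV", "BV"}
    if a in vowel_like and b in vowel_like:
        return 0.5
    return 1.0


def align_cost(detected, lexical):
    """Dynamic-programming alignment cost between two coarse-class sequences."""
    n, m = len(detected), len(lexical)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = float(i)  # unmatched detected segments
    for j in range(1, m + 1):
        d[0][j] = float(j)  # unmatched lexical segments
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + dissimilarity(detected[i - 1], lexical[j - 1]),
                d[i - 1][j] + 1.0,
                d[i][j - 1] + 1.0,
            )
    return d[n][m]


def shortlist(detected, lexicon, keep_fraction=0.05):
    """Return the best-matching fraction of the lexicon (at least one word)."""
    ranked = sorted(lexicon, key=lambda word: align_cost(detected, lexicon[word]))
    keep = max(1, round(len(ranked) * keep_fraction))
    return ranked[:keep]


# Toy lexicon compiled "from text" via the hypothetical table above.
lexicon = {
    "seat": to_coarse(["s", "iy", "t"]),
    "meat": to_coarse(["m", "iy", "t"]),
    "soot": to_coarse(["s", "uw", "t"]),
    "tar": to_coarse(["t", "aa", "r"]),
}
candidates = shortlist(["FRI", "FV", "PLO"], lexicon, keep_fraction=0.25)
```

In this toy setup the detected sequence Fricative, Front Vowel, Plosive matches "seat" exactly, so it is the only word kept when a quarter of the four-word lexicon is retained. The 5% default for `keep_fraction` simply mirrors the reduction figure reported in the abstract.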
