In this paper we present a template-independent knowledge source (KS) that uses coarse phonetic information to substantially constrain the candidate vocabulary for word hypothesization with very large vocabularies. It consists of three parts: a segmenter that breaks a test utterance into a sequence of coarse phonetic classes; a knowledge compiler that generates a reference dictionary containing the appropriate coarse phonetic representation for each word candidate; and a matching engine. Coarse phonetic classification is performed using linear discriminant analysis, more specifically perceptron classification. The knowledge compiler first generates a phonemic representation and segmental durations by rule from a list of word candidates (i.e., from text), and then derives coarse phonetic class segments. Matching is performed by a nonlinear time alignment algorithm based on dissimilarity scores between detected and lexical coarse class segments. The coarse phonetic KS was tested by compiling a word list of approximately 1500 words. Using only the coarse classes Silence, Plosive, Fricative, Vocalic, Front Vowel, Back Vowel, Nasal, and R, the vocabulary is reduced to 5% of its original size at an error rate below 5% for three different speakers.
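The matching step can be illustrated with a minimal sketch of nonlinear time alignment over coarse class sequences. It is not the algorithm used in the paper: the class abbreviations, the dissimilarity values (0, 0.5, 1), the function names `dissimilarity`, `align_cost`, and `shortlist`, and the toy lexicon are all assumptions made for illustration, and segmental durations are ignored.

```python
# Hypothetical abbreviations for the eight coarse classes named in the abstract:
# SIL, PLO, FRI, VOC, FV, BV, NAS, R.
VOWEL_LIKE = {"VOC", "FV", "BV"}


def dissimilarity(a, b):
    """Assumed local score: 0 for identical classes, 0.5 between
    vowel-like classes, 1 otherwise. These values are illustrative only."""
    if a == b:
        return 0.0
    if a in VOWEL_LIKE and b in VOWEL_LIKE:
        return 0.5
    return 1.0


def align_cost(detected, lexical):
    """Dynamic-programming alignment cost between a detected segment
    sequence and a lexical segment sequence (standard DTW recursion)."""
    n, m = len(detected), len(lexical)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = dissimilarity(detected[i - 1], lexical[j - 1])
            # Allow one segment to stretch over several on either side.
            d[i][j] = local + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def shortlist(detected, lexicon, keep_fraction=0.05):
    """Rank words by alignment cost and keep the best fraction (at least one)."""
    ranked = sorted(lexicon, key=lambda w: align_cost(detected, lexicon[w]))
    k = max(1, int(len(ranked) * keep_fraction))
    return ranked[:k]


# Toy lexicon with made-up coarse transcriptions.
lexicon = {
    "pen": ["PLO", "FV", "NAS"],
    "sue": ["FRI", "BV"],
    "ram": ["R", "FV", "NAS"],
}
detected = ["PLO", "FV", "FV", "NAS"]  # the vowel was split into two segments
candidates = shortlist(detected, lexicon, keep_fraction=0.34)
```

In this toy case the doubled vowel segment is absorbed by the alignment, so "pen" is matched at zero cost and is the only word retained. With a lexicon of about 1500 words, a `keep_fraction` of 0.05 would correspond to the 5% reduction reported above.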