Automatic recognition of continuously spoken sentences from a finite state grammar

We report performance results on the recognition of continuously spoken sentences from the finite state grammar for the "New Raleigh Language" (vocabulary: 250 words; average sentence length: 8 words; entropy: 2.86 bits/word; perplexity: 7.27 words). Sentence and word error rates of 5% and 0.6%, respectively, are achieved using a new centisecond-level model for the acoustic processor. We also report results for the "CMU-AIX05 Language" (vocabulary: 1011 words; average sentence length: about 7 words; entropy: 2.18 bits/word; perplexity: 4.53 words), using both our earlier phone-level model and the centisecond-level model. With the phone-level acoustic-processor model, sentence and word error rates of 2% and 0.8%, respectively, are achieved. With the centisecond-level model, sentence and word error rates are 1% and 0.1%, respectively.
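
The entropy and perplexity figures quoted for the two languages can be related by the standard definition of perplexity as 2 raised to the per-word entropy in bits. The abstract does not state this formula, so it is an assumption here, and the function and variable names below are purely illustrative. A minimal sketch of the check:

```python
# Assumption: perplexity = 2 ** entropy (entropy in bits/word).
# This is the usual definition; the abstract itself does not spell it out.

def perplexity_from_entropy(entropy_bits_per_word: float) -> float:
    """Return the perplexity implied by a per-word entropy given in bits."""
    return 2.0 ** entropy_bits_per_word


# (entropy in bits/word, perplexity in words) as quoted in the abstract.
REPORTED = {
    "New Raleigh Language": (2.86, 7.27),
    "CMU-AIX05 Language": (2.18, 4.53),
}

for name, (entropy, reported_perplexity) in REPORTED.items():
    implied = perplexity_from_entropy(entropy)
    print(f"{name}: implied {implied:.2f}, reported {reported_perplexity}")
```

Under this assumption the implied values agree with the reported perplexities to within the rounding of the quoted entropies. Read this way, the 1011-word CMU-AIX05 grammar constrains the recognizer more tightly per word than the 250-word New Raleigh grammar, despite its larger vocabulary.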