CONTEXTUAL WORD AND SYLLABLE PRONUNCIATION MODELS

This work evaluates models of syllable and word pronunciations constructed automatically from the Broadcast News corpus of radio and television news reports. Previous work [4] introduced the concept of extended-length decision tree models; here I report on an ASR-independent assessment of these models. This study also discusses the integration of static and dynamic pronunciation evaluation using the ROVER algorithm for combining hypotheses, and details the improvements obtained with dynamic pronunciation evaluation on the 1998 DARPA Broadcast News test set. The new pronunciation models improve system robustness for speech other than pre-planned speech recorded under studio conditions; these models appear to represent both linguistic variation (as in spontaneous speech) and variation due to channel effects in telephone-bandwidth speech.