Optimal acoustic and language model weights for minimizing word verification errors

Generalized word posterior probability (GWPP), a confidence measure for verifying recognized words, needs to equalize and weight acoustic and language model likelihood contributions to minimize verification errors. In this study, we investigate the word verification error surface and use it to optimize these weights and the corresponding verification threshold on a development set. We test three different search algorithms for finding the optimal parameters: a full grid search, a gradient-based steepest descent search, and a downhill simplex search. The three search methods yield very similar solutions. Proper acoustic and language model weights, especially the ratio between them, change with the relative importance (reliability) of the two knowledge sources. For a narrow beam width, the role of the acoustic model is less critical than that of the language model in GWPP-based word verification, which is due to the noisy acoustic information maintained in a narrow beam. Using a large-vocabulary continuous Japanese speech database (Basic Travel Expression Corpus), the largest relative improvement obtained is 33.2% for confidence error rate and 38.7% for a modified word accuracy.
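
As a rough illustration of the full grid search over the two weights and the verification threshold, the following is a minimal sketch under stated assumptions, not the procedure used in this study. It assumes a simplified word posterior: each competing path gets the score alpha * (acoustic log-likelihood) + beta * (language model log-likelihood), and the posterior is the exponentiated score mass of paths containing the word divided by the mass of all paths. The time-registration relaxation of GWPP is not modeled. The function names, the grid values, and the toy development data are all hypothetical.

```python
import math
from itertools import product


def word_posterior(paths, alpha, beta):
    """Simplified posterior of one word (assumed form, not the exact GWPP).

    paths: list of (acoustic_loglik, lm_loglik, contains_word) tuples.
    """
    scores = [alpha * a + beta * l for a, l, _ in paths]
    m = max(scores)  # subtract the max before exponentiating, for stability
    total = sum(math.exp(s - m) for s in scores)
    hit = sum(math.exp(s - m) for s, p in zip(scores, paths) if p[2])
    return hit / total


def confidence_error_rate(samples, alpha, beta, threshold):
    """Fraction of words that are falsely accepted or falsely rejected.

    samples: list of (paths, is_correct) pairs from a development set.
    """
    errors = 0
    for paths, is_correct in samples:
        accepted = word_posterior(paths, alpha, beta) >= threshold
        errors += accepted != is_correct
    return errors / len(samples)


def grid_search(samples, alphas, betas, thresholds):
    """Exhaustively evaluate every grid point; return (error, alpha, beta, threshold)."""
    return min(
        ((confidence_error_rate(samples, a, b, t), a, b, t)
         for a, b, t in product(alphas, betas, thresholds)),
        key=lambda r: r[0],
    )


# Toy development set (made-up numbers, for illustration only).
dev = [
    ([(-100.0, -10.0, True), (-102.0, -14.0, False)], True),
    ([(-100.0, -15.0, True), (-101.0, -9.0, False)], False),
    ([(-98.0, -12.0, True), (-105.0, -11.0, False)], True),
    ([(-103.0, -13.0, True), (-99.0, -12.0, False)], False),
]

alphas = [0.0, 0.01, 0.05, 0.1]          # hypothetical acoustic weights
betas = [0.0, 0.5, 1.0]                  # hypothetical language model weights
thresholds = [i / 10 for i in range(1, 10)]

best = grid_search(dev, alphas, betas, thresholds)
print("error=%.3f alpha=%s beta=%s threshold=%s" % best)
```

The same `confidence_error_rate` function could serve as the objective for a steepest descent or downhill simplex search. Because it is piecewise constant in the parameters, a gradient-based search would need a smoothed surrogate of it, which this sketch does not provide.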