Semantic confidence measurement for spoken dialog systems

This paper proposes two methods for incorporating semantic information into word-level and concept-level confidence measurement. The first method uses tag and extension probabilities obtained from a statistical classer and parser. The second method uses a maximum entropy based semantic structured language model to assign a probability to each word. On an air travel reservation task, combining these semantic features with a lattice posterior probability based confidence measure yields significant improvements over the posterior probability alone. At a 5% False Alarm (FA) rate, relative improvements of 28% and 61% in Correct Acceptance (CA) rate are achieved for word-level and concept-level confidence measurement, respectively.
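The evaluation metric quoted above can be illustrated with a minimal sketch. It assumes that the CA rate is the fraction of correctly recognized items whose confidence is at or above a threshold, and that the FA rate is the fraction of incorrectly recognized items accepted at that same threshold; the abstract does not spell out these definitions, so they are assumptions. The function names `ca_at_fa` and `relative_improvement` and the toy scores are hypothetical and are not taken from the paper.

```python
def ca_at_fa(scores, labels, target_fa=0.05):
    """Best Correct Acceptance rate reachable with False Alarm rate <= target_fa.

    scores: confidence score per hypothesized word or concept (higher = more confident)
    labels: True if the item was recognized correctly, False otherwise
    Assumed definitions (not from the paper):
      CA = accepted correct items / all correct items
      FA = accepted incorrect items / all incorrect items
    """
    correct = [s for s, ok in zip(scores, labels) if ok]
    incorrect = [s for s, ok in zip(scores, labels) if not ok]
    if not correct or not incorrect:
        raise ValueError("need at least one correct and one incorrect item")

    best_ca = 0.0
    # Candidate thresholds: every observed score, plus +inf (accept nothing).
    for threshold in sorted(set(scores)) + [float("inf")]:
        fa = sum(s >= threshold for s in incorrect) / len(incorrect)
        ca = sum(s >= threshold for s in correct) / len(correct)
        if fa <= target_fa:
            best_ca = max(best_ca, ca)
    return best_ca


def relative_improvement(baseline, improved):
    """Relative gain of `improved` over `baseline`, e.g. 0.28 for a 28% gain."""
    return (improved - baseline) / baseline


if __name__ == "__main__":
    # Toy data for illustration only; these are not results from the paper.
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
    labels = [True, True, False, True, False, False]
    print(ca_at_fa(scores, labels, target_fa=0.05))
```

Under these assumptions, a baseline confidence measure and a semantically augmented one would each be scored with `ca_at_fa` at `target_fa=0.05`, and the two CA rates compared with `relative_improvement`.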
