Tightly integrated spoken language understanding using word-to-concept translation

This paper presents an integrated spoken language understanding method that uses a statistical translation model from words to semantic concepts. The translation model is N-gram-based, so it can be integrated easily with speech recognition. It can be trained on annotated corpora in which only sentence-level alignments between word sequences and concept sets are available, because words and concepts are aligned automatically on the basis of their co-occurrence. This reduces the effort of explicitly aligning each word to its corresponding concept. The method determines the confidence of understanding hypotheses, which is used for rejection, in a manner similar to word-posterior-based confidence scoring in speech recognition. Experimental results show the advantages of integration over a cascaded method of speech recognition followed by word-to-concept translation in spoken language understanding with confidence-based rejection.
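
To make the idea of co-occurrence-based alignment concrete, the following is a minimal sketch, not the paper's actual training procedure. It assumes an IBM Model 1 style EM estimate of t(concept | word) from sentence-level pairs, which is one common way to align two sides of a corpus using co-occurrence alone. The function names, the toy utterances, and the concept labels (`CITY_BOSTON`, `CITY_DENVER`) are all hypothetical.

```python
from collections import defaultdict


def train_cooccurrence_model(pairs, iterations=10):
    """Estimate t[word][concept] by EM from (words, concepts) pairs.

    Assumption: IBM Model 1 style training; only sentence-level pairs
    are needed, no word-level alignment annotation.
    """
    word_vocab = {w for words, _ in pairs for w in words}
    concept_vocab = {c for _, concepts in pairs for c in concepts}

    # Start from a uniform distribution over concepts for every word.
    uniform = 1.0 / len(concept_vocab)
    t = {w: {c: uniform for c in concept_vocab} for w in word_vocab}

    for _ in range(iterations):
        count = defaultdict(lambda: defaultdict(float))
        total = defaultdict(float)

        # E-step: share each concept among the words of its utterance.
        for words, concepts in pairs:
            for c in concepts:
                norm = sum(t[w][c] for w in words)
                for w in words:
                    frac = t[w][c] / norm
                    count[w][c] += frac
                    total[w] += frac

        # M-step: renormalize the expected counts per word.
        for w in word_vocab:
            for c in concept_vocab:
                t[w][c] = count[w][c] / total[w] if total[w] > 0 else 0.0

    return t


def align(words, concepts, t):
    """Link each concept to the word that most strongly predicts it."""
    return {c: max(words, key=lambda w: t[w][c]) for c in concepts}


# Hypothetical toy corpus: "to" co-occurs with both concepts,
# while each city name co-occurs with only one.
pairs = [
    (["to", "boston"], ["CITY_BOSTON"]),
    (["to", "denver"], ["CITY_DENVER"]),
]
t = train_cooccurrence_model(pairs)
alignment = align(["to", "boston"], ["CITY_BOSTON"], t)
```

In this toy corpus the function word "to" spreads its probability mass over both concepts, so each city name ends up as the preferred alignment point for its own concept even though no word-level annotation was supplied.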
