Using Graphical Models for an Intelligent Mixed-Initiative Dialog Management System

The main goal of dialog management is to gather all the information needed to perform a task, e.g. an SQL query or a navigation task. Two principal approaches to dialog management exist: system-directed and mixed-initiative. In this paper, we combine both approaches in a novel way to address the problem of natural, intuitive dialog management. The objective of our approach is to provide a natural dialog flow. The whole dialog is therefore represented as a finite state machine: the states represent the information gathered so far, and the transitions denote the dialog steps into which the dialog is divided. The information is obtained from each naturally spoken sentence by hierarchical decoding into tags, e.g. the name tag and the address tag. These information tags are gathered during the dialog, either on the user's initiative or through explicit questions from the dialog manager. The graphical models use information from the semantic information tags, the dialog history, and the training corpus. Integrating these parts, we find the best path to the end of the dialog by Viterbi decoding through the transition network after each information step. For the training phase, we extract all 21650 naturally spoken questions, together with the SQL queries that answer them, from the Air Travel Information System (ATIS) database. The experiments were carried out on 200 automatically generated dialog sentences. The system obtains the semantic information in all test sentences and leads every dialog successfully to the end. In 66.5% of the sample dialogs, the system reaches the end in the minimum number of required dialog steps; the remaining 33.5% of the dialogs take more steps than necessary.
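The idea of decoding a best path through the dialog state machine can be illustrated with a minimal sketch. It is not the authors' model: the tag names, the probabilities in `TAG_PROB`, and the assumption that every dialog step fills exactly one missing tag are all hypothetical and chosen only for illustration. Each state is the set of tags gathered so far, and a Viterbi-style pass keeps, for each reachable state, the most probable path leading to it.

```python
import math

# Hypothetical information tags; the paper's actual tag set is not given here.
TAGS = ("origin", "destination", "date")

# Hypothetical probability that the next dialog step fills a given tag
# (in a real system this would be estimated from a training corpus).
TAG_PROB = {"origin": 0.5, "destination": 0.3, "date": 0.2}


def successors(state):
    """Yield (next_state, log_prob) for every tag still missing in `state`."""
    missing = [t for t in TAGS if t not in state]
    total = sum(TAG_PROB[t] for t in missing)
    for tag in missing:
        # Renormalize over the tags that can still be asked for.
        yield state | {tag}, math.log(TAG_PROB[tag] / total)


def viterbi_path(start):
    """Return (log_prob, path) of the most probable path to the final state.

    Assumes each dialog step fills exactly one missing tag, so the
    transition network is layered by the number of gathered tags.
    """
    start = frozenset(start)
    goal = frozenset(TAGS)
    layer = {start: (0.0, [start])}
    for _ in range(len(goal) - len(start)):
        next_layer = {}
        for state, (score, path) in layer.items():
            for succ, logp in successors(state):
                cand = score + logp
                # Keep only the best-scoring path into each state.
                if succ not in next_layer or cand > next_layer[succ][0]:
                    next_layer[succ] = (cand, path + [succ])
        layer = next_layer
    return layer[goal]


if __name__ == "__main__":
    score, path = viterbi_path({"origin"})
    for step, state in enumerate(path):
        print(step, sorted(state))
    print("path probability:", math.exp(score))
```

Rerunning `viterbi_path` with the updated tag set after every user turn mirrors the decoding "after each information step" described above: whichever tags the user volunteers are added to the start state, and the remaining questions follow the newly decoded path.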
