A novel approach for data fusion and dialog management in user-adapted multimodal dialog systems

Multimodal dialog systems have demonstrated a high potential for more flexible, usable and natural human-computer interaction. These improvements depend heavily on the fusion and dialog management processes, which respectively integrate and interpret the multimodal input information and decide the next system response for the current dialog state. In this paper we propose to carry out multimodal fusion and dialog management at the dialog level in a single step. To do this, we describe an approach based on a statistical model that takes the user's intention into account, generates a single representation from the different input modalities and their confidence scores, and selects the next system action based on this representation. The paper also describes the practical application of the proposed approach to the development of a multimodal dialog system providing travel and tourist information.
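To make the idea concrete, the following is a minimal sketch, not the authors' actual model, of how per-modality hypotheses with confidence scores can be fused into a single dialog-level representation from which a statistical policy picks the next system action. All names (ModalityHypothesis, fuse, StatisticalPolicy) and the frequency-based action selection are illustrative assumptions.

```python
# Hypothetical sketch: dialog-level fusion of modality hypotheses plus a
# toy statistical policy that estimates P(action | fused state) from counts.
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class ModalityHypothesis:
    """Semantic interpretation produced by one input modality."""
    modality: str       # e.g. "speech", "touch", "gesture"
    intention: str      # predicted user intention (dialog act)
    slots: dict         # slot -> value extracted by this modality
    confidence: float   # recognizer confidence score in [0, 1]


def fuse(hypotheses):
    """Merge per-modality hypotheses into one dialog-level representation.

    The user intention is taken from the most confident hypothesis; each
    slot keeps the value proposed with the highest confidence overall.
    """
    best = max(hypotheses, key=lambda h: h.confidence)
    fused_slots = {}
    for h in sorted(hypotheses, key=lambda h: h.confidence):
        for slot, value in h.slots.items():
            fused_slots[slot] = value  # later (higher-confidence) value wins
    return {"intention": best.intention, "slots": fused_slots}


class StatisticalPolicy:
    """Toy dialog manager: next action chosen by observed frequency."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # state key -> Counter of actions

    def _key(self, state):
        return (state["intention"], tuple(sorted(state["slots"])))

    def train(self, annotated_turns):
        for state, action in annotated_turns:  # (fused state, system action)
            self.counts[self._key(state)][action] += 1

    def next_action(self, state, default="ask_clarification"):
        seen = self.counts.get(self._key(state))
        return seen.most_common(1)[0][0] if seen else default


if __name__ == "__main__":
    hyps = [
        ModalityHypothesis("speech", "request_info", {"destination": "Granada"}, 0.82),
        ModalityHypothesis("touch", "request_info", {"date": "2014-05-10"}, 0.95),
    ]
    state = fuse(hyps)
    policy = StatisticalPolicy()
    policy.train([(state, "provide_travel_info")])
    print(state, policy.next_action(state))
```

In this sketch the fused representation plays the role of the single intermediate structure described in the abstract; a trained statistical model would replace the simple frequency table used here for action selection.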
