The Role of Spoken Dialogue in User–Environment Interaction

Publisher Summary: This chapter reviews the role of spoken dialogue in ambient intelligence environments. It begins by describing and comparing the functions and technical characteristics of different spoken dialogue applications, including voice control, call routing, voice search, and question answering. It then reviews recent research that illustrates the current state of the art in spoken dialogue systems for ambient intelligence environments and examines the technological requirements for such systems. Many commercially deployed systems already automate a variety of customer services, such as providing flight information, weather forecasts, sports results, and share prices. They can also support transactions such as booking hotels, renting cars, making payments, or downloading ringtones for mobile phones. These systems free human operators from mundane tasks that can be easily automated and for which spoken dialogue is a natural mode of communication. There are also systems that accept spoken input for data entry, such as medical reports or accident reports for insurance companies. In these systems the interpreted input is either transcribed into predefined forms or rendered as a dictated document. The chapter concludes with an overview of challenges and future prospects for this emerging technology.
