Novel Approach to Spoken Dialogue Management in Intelligent Environments

Intelligent Environments (IEs) consist of various entities and provide and execute different tasks in parallel. Since neither the set of entities nor the set of tasks necessarily remains constant while the user interacts with the system, we speak of the changing nature of such environments. Against this background, we have identified adaptation as the major characteristic that an SDM needs in order to provide a consistent interface. In Sect. 2.4 we presented several approaches to SDM that partly cover specific aspects of adaptation or adaptivity. However, to develop a system that provides adaptive spoken dialogue within IEs, a general definition of adaptation is required. This definition concerns the three main stakeholders involved in spoken interaction: the user(s), the SDS, and the IE. A fourth party is the ASDM, which plays a key role: while the ASDM must handle adaptation, the other parties provoke it. In the following, we discuss our proposed definition and provide a complete description of adaptivity with respect to the stakeholders mentioned above.
