SmartKom: adaptive and flexible multimodal access to multiple applications

The development of an intelligent user interface that supports multimodal access to multiple applications is a challenging task. In this paper we present a generic multimodal interface system in which the user interacts with an anthropomorphic, personalized interface agent using speech and natural gestures. SmartKom's knowledge-based, uniform approach enables us to realize a comprehensive system that understands imprecise, ambiguous, or incomplete multimodal input and generates coordinated, cohesive, and coherent multimodal presentations for three scenarios, currently covering more than 50 different functionalities across 14 applications. We demonstrate the main ideas in a walkthrough of the main processing steps, from modality fusion to modality fission.
