SmartKom : Symmetric Multimodality in an Adaptive and Reusable Dialogue Shell

We introduce the notion of symmetric multimodality for dialogue systems, in which all input modes (e.g., speech, gesture, facial expression) are also available for output, and vice versa. A dialogue system with symmetric multimodality must not only understand and represent the user's multimodal input but also its own multimodal output. We present the SmartKom system, which provides full symmetric multimodality in a mixed-initiative dialogue system with an embodied conversational agent. SmartKom represents a new generation of multimodal dialogue systems that deal not only with simple modality integration and synchronization but cover the full spectrum of dialogue phenomena associated with symmetric multimodality, including crossmodal references, one-anaphora, and backchannelling. We show that SmartKom's plug-and-play architecture supports multiple recognizers for a single modality; e.g., the user's speech signal can be processed by three unimodal recognizers in parallel (speech recognition, emotional prosody, boundary prosody). Finally, we detail SmartKom's three-tiered representation of multimodal discourse, consisting of a domain layer, a discourse layer, and a modality layer. To conclude, we discuss the economic and scientific impact of the SmartKom project, which has led to more than 50 patents and 29 spin-off products.
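The parallel-recognizer idea from the abstract can be sketched as follows. This is a minimal illustration, not SmartKom's actual implementation: the three recognizer functions and their return values are hypothetical stand-ins, and the fan-out/merge logic only demonstrates how one speech signal could be dispatched to several unimodal analysers in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the three unimodal speech analysers named in
# the abstract; the names and return values are illustrative only.
def speech_recognizer(signal):
    return {"words": ["show", "me", "the", "map"]}

def emotional_prosody(signal):
    return {"emotion": "neutral"}

def boundary_prosody(signal):
    return {"phrase_boundaries": [4]}

def analyse_speech(signal):
    """Fan the same speech signal out to all recognizers in parallel
    and merge their partial results into one analysis record."""
    recognizers = [speech_recognizer, emotional_prosody, boundary_prosody]
    with ThreadPoolExecutor(max_workers=len(recognizers)) as pool:
        partials = pool.map(lambda r: r(signal), recognizers)
    merged = {}
    for partial in partials:
        merged.update(partial)
    return merged
```

A plug-and-play architecture in this spirit lets a new recognizer for an existing modality be added simply by appending it to the list, without changing the dispatch or fusion code.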
