In this contribution, we present a generic and therefore easily scalable multimodal framework for error-robust processing of user interactions in various domains. The system provides a generic kernel that evaluates user inputs together with additional information from the situational, personal, and functional context. After an initial domain-specific configuration, the system can detect a set of error situations and patterns. When an error is detected or likely to occur, a context-adequate dialog output is generated. For classifying the error patterns and selecting the corresponding dialog strategy, we have implemented a fuzzy-logic algorithm based on Mamdani controllers. The multimodal framework has been applied and evaluated in two application domains: an in-car infotainment and communication system and a 3D virtual shopping mall in a desktop PC environment. From a large user test, we transcribed eleven error-scenario contexts, each consisting of 15 individual test sets, and analyzed them in an offline evaluation. In the VR domain, the rate of correctly detected error patterns ranged from 90.7% to 95.0% (86.7% to 94.3% in the car domain). The rate of appropriately selected error-resolution strategies ranged from 93.9% to 96.3% (91.0% to 96.1% in the car domain).
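To illustrate the Mamdani-style inference underlying the error-pattern classification, the following is a minimal sketch in Python. The input variables (recognizer confidence, input-timing deviation), the rule base, and all membership functions are illustrative assumptions, not the paper's actual configuration; it only shows the standard Mamdani pipeline of min-AND rule firing, max aggregation, and centroid defuzzification.

```python
# Hedged sketch of Mamdani fuzzy inference for error-pattern scoring.
# All variables, sets, and rules below are ASSUMED for illustration.

def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    if x == b:
        return 1.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# Input fuzzy sets over [0, 1] (assumed): recognizer confidence, timing deviation.
conf_low  = lambda x: tri(x, -0.01, 0.0, 0.5)
conf_high = lambda x: tri(x, 0.5, 1.0, 1.01)
dev_small = lambda x: tri(x, -0.01, 0.0, 0.5)
dev_large = lambda x: tri(x, 0.5, 1.0, 1.01)

# Output fuzzy sets (assumed): likelihood that an error pattern is present.
err_unlikely = lambda y: tri(y, -0.01, 0.0, 0.5)
err_likely   = lambda y: tri(y, 0.5, 1.0, 1.01)

def mamdani(confidence, deviation, steps=200):
    """Min for AND, max for aggregation, discrete centroid defuzzification."""
    # Rule 1 (assumed): low confidence AND large deviation -> error likely.
    w1 = min(conf_low(confidence), dev_large(deviation))
    # Rule 2 (assumed): high confidence AND small deviation -> error unlikely.
    w2 = min(conf_high(confidence), dev_small(deviation))
    num = den = 0.0
    for i in range(steps + 1):
        y = i / steps
        # Clip each consequent by its rule strength, then take the max.
        mu = max(min(w1, err_likely(y)), min(w2, err_unlikely(y)))
        num += y * mu
        den += mu
    return num / den if den else 0.5  # neutral output if no rule fires

# Low confidence with a large timing deviation yields a high error likelihood.
print(mamdani(0.1, 0.9) > 0.7)
```

The crisp output in [0, 1] could then be thresholded, or compared across several such controllers (one per error pattern), to select a dialog strategy.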