Development of a Generic Multimodal Framework for Handling Error Patterns during Human-Machine Interaction

In this contribution, we present a generic, and therefore easily scalable, multimodal framework for error-robust processing of user interactions in various domains. The system provides a generic kernel that evaluates user inputs together with additional information from the situational, personal, and functional context. After an initial domain-specific configuration, the system is capable of detecting a set of error situations and patterns. Whenever an error is detected or likely to occur, a context-adequate dialog output is generated. For the classification of error patterns and the selection of the corresponding dialog strategy, we have implemented a fuzzy-logic algorithm based on Mamdani controllers. The multimodal framework has been applied and evaluated in two application domains: an in-car infotainment and communication system and a 3D virtual shopping mall in a desktop PC environment. From a large user test, we transcribed eleven error scenario contexts, each consisting of 15 individual test sets, and analyzed them in an offline evaluation. In the VR domain, the rates of correctly detected error patterns ranged from 90.7% to 95.0% (86.7% to 94.3% in the car domain); the rates of appropriately selected error-resolution strategies ranged from 93.9% to 96.3% (91.0% to 96.1% in the car domain).
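As a rough illustration of the Mamdani-style inference mentioned above, the following sketch walks through the standard pipeline of fuzzification, min/max rule evaluation, and centroid defuzzification. The input variables (recognizer confidence, recent error-pattern frequency), the output variable (a dialog escalation level), and the membership functions and rule base are all illustrative assumptions, not the configuration used in the evaluated systems.

```python
import numpy as np

def trimf(x, a, b, c):
    """Triangular membership function over points a <= b <= c.

    a == b or b == c yields a left/right shoulder instead of a triangle.
    """
    x = np.asarray(x, dtype=float)
    left = np.ones_like(x) if b == a else (x - a) / (b - a)
    right = np.ones_like(x) if c == b else (c - x) / (c - b)
    return np.clip(np.minimum(left, right), 0.0, 1.0)

# Universe of discourse for the output: a hypothetical dialog escalation level in [0, 1].
y = np.linspace(0.0, 1.0, 201)

# Output fuzzy sets (illustrative): implicit confirmation, explicit confirmation, help dialog.
out_sets = {
    "implicit": trimf(y, 0.0, 0.0, 0.5),
    "explicit": trimf(y, 0.25, 0.5, 0.75),
    "help":     trimf(y, 0.5, 1.0, 1.0),
}

def select_strategy(confidence, error_freq):
    """Mamdani inference for two crisp inputs in [0, 1]:
    recognizer confidence and recent error-pattern frequency."""
    # Fuzzification of the crisp inputs (illustrative membership functions).
    conf_low, conf_high = trimf(confidence, 0.0, 0.0, 0.5), trimf(confidence, 0.5, 1.0, 1.0)
    freq_low, freq_high = trimf(error_freq, 0.0, 0.0, 0.5), trimf(error_freq, 0.5, 1.0, 1.0)

    # Rule base (illustrative); AND is modeled as min, as usual for Mamdani controllers.
    rules = [
        (min(conf_high, freq_low), "implicit"),  # confident input, few errors
        (min(conf_low, freq_low), "explicit"),   # uncertain input, few errors
        (freq_high, "help"),                     # repeated errors: escalate to a help dialog
    ]

    # Implication (clip each output set at its rule strength) and max-aggregation.
    aggregated = np.zeros_like(y)
    for strength, label in rules:
        aggregated = np.maximum(aggregated, np.minimum(strength, out_sets[label]))

    # Centroid defuzzification yields a crisp escalation level in [0, 1].
    return 0.0 if aggregated.sum() == 0 else float((y * aggregated).sum() / aggregated.sum())

# Example: a low-confidence input combined with frequent errors escalates the dialog.
print(select_strategy(confidence=0.3, error_freq=0.8))
```

In such a setup, the crisp escalation level would be mapped onto discrete dialog strategies (e.g., implicit confirmation below a threshold, a help dialog above another); how the actual framework maps its controller outputs to dialog strategies is not specified in this abstract.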