Mobile Multimodal Dialogue Systems

Mobile multimodal dialogue systems allow the user and the system to adapt their choice of input and output modality to various technical and cognitive resource limitations and to the task at hand. We present the multimodal dialogue system SmartKom, which can be used as a mobile travel companion for car drivers and pedestrians. SmartKom combines speech, gestures, and facial expressions for both input and output. It provides an anthropomorphic and affective interface through its personification of an interface agent. SmartKom features the situated understanding of possibly incomplete or ambiguous input and the generation of coordinated and adaptive multimodal output. The mutual disambiguation of modalities and the resolution of multimodal anaphora are based on a three-tiered discourse model that consists of a domain layer, a discourse layer, and a modality layer. We show that a multimodal dialogue system must understand and represent not only the user’s input but also its own multimodal output in a modality-free way. We argue that intelligent multimodal interfaces are key to consumer acceptance of new location-based web services for 3G UMTS smartphones, and we present some industrial spin-off products of the SmartKom consortium.
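
The three-tiered discourse model can be pictured as linking modality-free domain objects to their realizations across modalities, so that a later reference (e.g. a pointing gesture or "this one") can be resolved against earlier input and output. The following Python sketch illustrates this idea only; all class names, fields, and the resolution strategy are illustrative assumptions, not the actual SmartKom data structures.

```python
# Illustrative sketch of a three-tiered discourse model (domain, discourse,
# modality layer). Names and logic are assumptions, not the SmartKom code.
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DomainObject:
    """Modality-free representation of an entity in the application domain."""
    object_id: str
    attributes: dict


@dataclass
class ModalityObject:
    """How an entity was realized in a concrete modality."""
    modality: str          # e.g. "speech", "gesture", "graphics"
    surface_form: str      # e.g. the spoken phrase or the screen region shown


@dataclass
class DiscourseObject:
    """Links one domain object to all of its realizations in input and output."""
    domain_ref: DomainObject
    realizations: list[ModalityObject] = field(default_factory=list)


class DiscourseModel:
    """Keeps a history of discourse objects so later references can be resolved."""

    def __init__(self) -> None:
        self.history: list[DiscourseObject] = []

    def add(self, obj: DiscourseObject) -> None:
        self.history.append(obj)

    def resolve_anaphor(self, modality: Optional[str] = None) -> Optional[DomainObject]:
        """Resolve a multimodal anaphor to the most recent discourse object,
        optionally preferring objects realized in a given modality."""
        for obj in reversed(self.history):
            if modality is None or any(r.modality == modality for r in obj.realizations):
                return obj.domain_ref
        return None


# Usage: the system displays a cinema on the map, the user then says "this one".
model = DiscourseModel()
cinema = DomainObject("cinema_42", {"name": "Metropolis", "city": "Heidelberg"})
model.add(DiscourseObject(cinema, [ModalityObject("graphics", "map icon at (120, 80)")]))
referent = model.resolve_anaphor(modality="graphics")
print(referent.object_id if referent else "unresolved")  # -> cinema_42
```

Because both the system's own graphical output and the user's input are recorded as realizations of the same modality-free domain object, the sketch also reflects the point made above: the system must represent its own multimodal output, not only the user's input, to resolve such references.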