The hinge between input and output: understanding the multimodal input fusion results in an agent-based multimodal presentation system

A multimodal interface provides multiple modalities for input and output, such as speech, eye gaze and facial expression. With recent progress in multimodal interfaces, various approaches to multimodal input fusion and output generation have been proposed. However, less attention has been paid to how to integrate the two in a multimodal input and output system. This paper proposes an approach, termed THE HINGE, for providing agent-based multimodal presentations in accordance with multimodal input fusion results. Analysis of the experimental results shows that the proposed approach enhances the flexibility of the system while maintaining its stability.