Paper information - The hinge between input and output: understanding the multimodal input fusion results in an agent-based multimodal presentation system


A multimodal interface provides multiple modalities for input and output, such as speech, eye gaze, and facial expression. With recent progress in multimodal interfaces, various approaches to multimodal input fusion and output generation have been proposed. However, less attention has been paid to how to integrate the two in a single multimodal input and output system. This paper proposes an approach, termed THE HINGE, for providing agent-based multimodal presentations in accordance with multimodal input fusion results. Analysis of the experimental results shows that the proposed approach enhances the flexibility of the system while maintaining its stability.
