A general framework for incremental processing of multimodal inputs

Humans communicate through several information channels (modalities), such as speech, pictures, and gestures. It is believed that some of these modalities are more error-prone for specific types of data, so combining modalities can help reduce ambiguity in the interaction. There have been numerous efforts to implement multimodal interfaces for computers and robots, yet there is no general, standard framework for developing them. In this paper we propose a general framework for implementing multimodal interfaces. It is designed to perform natural language understanding, multimodal integration, and semantic analysis in an incremental pipeline, and it includes a multimodal grammar language that is used for multimodal presentation and semantic meaning generation.
