Multimodal Fusion and Fission within the W3C MMI Architectural Pattern

The current W3C recommendation for multimodal interfaces standardizes the message exchange and the overall structure of modality components in multimodal applications. However, it leaves unspecified the details of multimodal fusion, which combines the inputs arriving from modality components, and of multimodal fission, which prepares multimodal presentations. This chapter provides a first analysis of how several fusion and fission approaches could be integrated into this architecture and of their implications for the standard.
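To make the standardized message exchange concrete, the sketch below (not taken from the chapter) composes one MMI lifecycle event, a StartRequest sent from the interaction manager to a modality component. The URIs and identifiers such as `urn:example:voiceMC` and `dialog.vxml` are hypothetical placeholders, not part of the standard.

```python
# Minimal sketch of a W3C MMI "StartRequest" lifecycle event, as exchanged
# between the interaction manager and a modality component. All concrete
# identifiers below are hypothetical examples.
import xml.etree.ElementTree as ET

MMI_NS = "http://www.w3.org/2008/04/mmi-arch"
ET.register_namespace("mmi", MMI_NS)

root = ET.Element(f"{{{MMI_NS}}}mmi", {"version": "1.0"})
start = ET.SubElement(root, f"{{{MMI_NS}}}StartRequest", {
    "Source": "urn:example:im",       # sender: the interaction manager
    "Target": "urn:example:voiceMC",  # receiver: a voice modality component
    "Context": "ctx-1",               # identifies the shared interaction context
    "RequestID": "req-42",            # correlates the request with its response
})
# Content to be executed by the modality component, referenced by URL.
ET.SubElement(start, f"{{{MMI_NS}}}ContentURL", {"href": "dialog.vxml"})

print(ET.tostring(root, encoding="unicode"))
```

The standard constrains only such lifecycle messages and the component structure; how an interaction manager would fuse the inputs reported back by several such components, or fission a presentation across them, is exactly the gap the chapter analyzes.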
