Benchmarking fusion engines of multimodal interactive systems

This article proposes an evaluation framework for benchmarking the performance of multimodal fusion engines. It first introduces the concepts and techniques associated with multimodal fusion engines and surveys recent implementations. It then discusses the importance of evaluation as a means of assessing fusion engines, not only from the user's perspective but also at the performance level. The article further proposes a benchmark and a formalism for building testbeds to assess multimodal fusion engines. In the last section, our current fusion engine and its associated system, HephaisTK, are evaluated using the proposed evaluation framework. The article concludes with a discussion of the proposed quantitative evaluation, suggestions for building useful testbeds, and proposed future improvements.