A design space for multimodal systems: concurrent processing and data fusion

Multimodal interaction enables the user to employ different modalities, such as voice, gesture, and typing, to communicate with a computer. This paper presents an analysis of the integration of multiple communication modalities within an interactive system, adopting a software engineering perspective. First, the notion of "multimodal system" is clarified. We aim to show that two main features of a multimodal system are the concurrency of processing and the fusion of input/output data. On the basis of these two features, we then propose a design space and a method for classifying multimodal systems. In the last section, we present a software architecture model of multimodal systems that supports these two salient properties: concurrency of processing and data fusion. Two multimodal systems developed in our team, VoicePaint and NoteBook, are used to illustrate the discussion.
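The abstract itself contains no code; the following is a minimal illustrative sketch of the kind of data fusion it refers to, combining two concurrently produced input events (a spoken command and a pointing gesture) into one higher-level command. The event types, field names, and the 500 ms fusion window are hypothetical assumptions, not taken from the paper.

```python
# Minimal illustrative sketch (not from the paper): fusing concurrent input
# events from two modalities -- speech and pointing -- into a single command.
# Event names, fields, and the fusion window are hypothetical assumptions.

from dataclasses import dataclass
from typing import Optional

FUSION_WINDOW_MS = 500  # assumed maximum time gap between related events


@dataclass
class SpeechEvent:
    timestamp_ms: int
    utterance: str          # e.g. "delete that"


@dataclass
class PointingEvent:
    timestamp_ms: int
    target_id: str          # object the user pointed at


@dataclass
class FusedCommand:
    action: str
    target_id: str


def fuse(speech: SpeechEvent, pointing: PointingEvent) -> Optional[FusedCommand]:
    """Combine a speech event with a pointing event produced concurrently.

    Returns a fused command when the two events fall inside the fusion
    window; otherwise returns None and each event is handled on its own.
    """
    if abs(speech.timestamp_ms - pointing.timestamp_ms) > FUSION_WINDOW_MS:
        return None
    # Crude parse for the sketch: strip the deictic "that" to get the action.
    action = speech.utterance.replace("that", "").strip()
    return FusedCommand(action=action, target_id=pointing.target_id)


if __name__ == "__main__":
    s = SpeechEvent(timestamp_ms=1000, utterance="delete that")
    p = PointingEvent(timestamp_ms=1200, target_id="note-42")
    print(fuse(s, p))  # FusedCommand(action='delete', target_id='note-42')
```

In a real system the two modality-specific processes would run concurrently and feed such a fusion step asynchronously; the sketch only shows the temporal-window criterion for deciding whether two events belong to the same command.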
