Symmetric Multimodal Interaction in a Dynamic Dialogue

Two important themes in current work on interfaces are multimodal interaction and the use of dialogue. Human multimodal dialogues are symmetric, i.e., both participants communicate multimodally. We describe a proof-of-concept system that supports symmetric multimodal communication, using speech and sketching, in the domain of simple mechanical device design. We discuss three major aspects of the communication: multimodal input processing, multimodal output generation, and creating a dynamic dialogue. While previous systems have had some of these capabilities individually, their combination appears to be unique. We provide examples from our system that illustrate a variety of user inputs and system outputs.

Author Keywords: multimodal, dynamic dialogue, sketch recognition, sketch generation, speech
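To give a concrete flavor of the input-processing aspect, the sketch below groups speech and sketch events into joint multimodal utterances by temporal proximity. It is a minimal, hypothetical illustration under assumed interfaces, not the system's actual algorithm; the class and function names (SpeechEvent, SketchEvent, fuse_by_time) and the one-second grouping gap are invented for this example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical event types for the two input modalities.
@dataclass
class SpeechEvent:
    start: float              # seconds
    end: float
    transcript: str

@dataclass
class SketchEvent:
    start: float
    end: float
    strokes: List[List[Tuple[float, float]]]   # each stroke is a list of (x, y) points

@dataclass
class MultimodalUtterance:
    speech: List[SpeechEvent] = field(default_factory=list)
    sketch: List[SketchEvent] = field(default_factory=list)

def fuse_by_time(speech, sketch, gap=1.0):
    """Group speech and sketch events that overlap in time, or fall within
    `gap` seconds of one another, into joint multimodal utterances.
    A simple temporal-proximity heuristic, used only for illustration."""
    events = sorted(
        [("speech", e) for e in speech] + [("sketch", e) for e in sketch],
        key=lambda pair: pair[1].start,
    )
    utterances, current, last_end = [], MultimodalUtterance(), None
    for kind, ev in events:
        if last_end is not None and ev.start - last_end > gap:
            utterances.append(current)            # gap exceeded: start a new utterance
            current = MultimodalUtterance()
        (current.speech if kind == "speech" else current.sketch).append(ev)
        last_end = ev.end if last_end is None else max(last_end, ev.end)
    if current.speech or current.sketch:
        utterances.append(current)
    return utterances

if __name__ == "__main__":
    speech = [SpeechEvent(0.2, 1.1, "draw a wheel here"),
              SpeechEvent(4.0, 4.8, "now connect it to the spring")]
    sketch = [SketchEvent(0.9, 1.6, [[(10, 10), (12, 14), (15, 11)]]),
              SketchEvent(4.5, 5.2, [[(30, 30), (40, 30)]])]
    for i, u in enumerate(fuse_by_time(speech, sketch)):
        print(f"utterance {i}: {len(u.speech)} speech event(s), {len(u.sketch)} sketch event(s)")
```

A full system would replace this proximity heuristic with alignment informed by the recognizers for each modality; the sketch only shows the shape of the fusion step.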
