When do we interact multimodally?: cognitive load and multimodal communication patterns

Mobile usage patterns often entail high and fluctuating levels of task difficulty as well as dual tasking. One major theme explored in this research is whether a flexible multimodal interface supports users in managing cognitive load. Findings from this study reveal that users of a multimodal interface spontaneously respond to dynamic changes in their own cognitive load by shifting to multimodal communication as load increases with task difficulty and communicative complexity. Given a flexible multimodal interface, users' ratio of multimodal (versus unimodal) interaction rose substantially, from 18.6% when referring to established dialogue context to 77.1% when required to establish a new context, a +315% relative increase. Likewise, the ratio of users' multimodal interaction increased significantly as tasks became more difficult: from 59.2% during low-difficulty tasks to 65.5% at moderate, 68.2% at high, and 75.0% at very high difficulty, an overall relative increase of +27%. Users' task-critical errors and response latencies also increased systematically and significantly across task difficulty levels, corroborating the manipulation of cognitive processing load. The adaptations seen in this study reflect users' efforts to self-manage limitations in working memory as task complexity increases. They accomplish this by distributing communicative information across multiple modalities, which is compatible with a cognitive load theory of multimodal interaction. The long-term goal of this research is the development of an empirical foundation for proactively guiding flexible and adaptive multimodal system design.
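
To make the reported effect sizes concrete, the arithmetic behind the two relative increases can be checked directly. The short Python sketch below is illustrative only, written for this summary rather than taken from the study; it recomputes both figures from the percentages quoted above:

    def relative_increase(before: float, after: float) -> float:
        """Percent change of `after` relative to the baseline `before`."""
        return (after - before) / before * 100.0

    # Multimodal interaction ratio: established vs. newly established dialogue context.
    context_shift = relative_increase(18.6, 77.1)     # (77.1 - 18.6) / 18.6 -> ~ +315%

    # Multimodal interaction ratio across task difficulty: low vs. very high.
    difficulty_shift = relative_increase(59.2, 75.0)  # (75.0 - 59.2) / 59.2 -> ~ +27%

    print(f"context: +{context_shift:.0f}%, difficulty: +{difficulty_shift:.0f}%")

Both results match the +315% and +27% figures reported above; in each case the relative increase is computed against the baseline condition (established context and low difficulty, respectively).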
