A comparison of speech and GUI input for navigation in complex visualizations on mobile devices

Mobile devices are ubiquitously used to access web applications. Multimodal mobile interfaces can offer advantages over less flexible approaches in both usability and range of features. In this study we consider applying speech input to a web-based network management service. The key issue we address is how to support multidimensional search through a web-based interface on mobile devices. We present results from a pilot user evaluation comparing a novel speech input method with the existing manual (GUI, Graphical User Interface) input for AT&T's Visualizer management service on an iPhone. Speech input was experimentally shown to be as effective as, more efficient than, and preferred over GUI input by most users. We anticipate that a multimodal approach may be preferable for many applications on mobile devices.
