User Expectations from Dictation on Mobile Devices

Mobile phones, with their increasing processing power and memory, are enabling a diversity of tasks. The traditional keypad-based text entry method falls short in numerous ways. Proposed solutions to this problem include QWERTY keyboards on phones, external keyboards, virtual keyboards projected onto table tops (Siemens at CeBIT '05), and, last but not least, automatic speech recognition (ASR) technology. Speech recognition enables dictation, which facilitates text input via voice. Despite progress, ASR systems still do not perform satisfactorily in mobile environments, mainly because of the difficulty of recognizing a large vocabulary spoken by diverse speakers in varying acoustic conditions. Dictation therefore has its advantages but also brings its own set of usability problems. The objective of this research is to uncover the various uses and benefits of dictation on a mobile phone. This study focused on users' needs, expectations, and concerns regarding the new input medium. Focus groups were conducted to investigate and discuss current data entry methods, the potential use and usefulness of a dictation feature, users' reactions to ASR errors during dictation, and possible error correction methods. Our findings indicate a strong demand for dictation: all participants perceived it to be very useful, provided it is easily accessible and usable. Potential applications for dictation were found in two distinct areas, namely communication and personal use.
