Speech Versus Touch: A Comparative Study of the Use of Speech and DTMF Keypad for Navigation

This article reports on an experiment that critically tests user preference for an input modality (speech vs. dual-tone multi-frequency [DTMF]) in a phone-based message retrieval system. Unlike previous studies that compared these two modalities, the speech system used in this study was a fully functioning natural language system, and the participants were working professionals rather than college students. Results indicate that (a) DTMF was more effective and efficient for linear tasks, whereas speech was better for nonlinear tasks; (b) speech was preferred to DTMF by a majority of users; (c) speech was judged as more satisfying, more entertaining, and easier to use than DTMF; and (d) user preference for a particular modality was better predicted by user performance in nonlinear tasks than in linear ones. Possible reasons for users' continuing preference for the speech modality, even after experiencing fairly high rates of recognition errors, are discussed. Finally, the importance of examining speech user interfaces from perspectives other than efficiency maximization is emphasized. The results of this study have theoretical as well as practical implications for the design of speech user interfaces and interactive voice response applications.
