Performance differences in a cross-cultural comparison of voice enhanced interface

Abstract The increasing power of computer hardware is enabling the user interface to enter a new dimension, where voice enhanced interfaces can remove the barrier between people and the machine for some users. In this study, the traditional keyboard and mouse interface (TI) was compared with a voice enhanced interface (VEI). An automatic semester grade calculation program was developed in Visual Basic and enhanced with “trainable” voice recognition software for Cantonese and English. Results show significant differences in task completion time, number of errors, and satisfaction between the traditional keyboard and mouse interface and the voice enhanced interface. In addition, contrary to our intuition, native Cantonese speakers made more voice input errors when speaking Cantonese than when speaking English. Though voice enhanced interfaces have many potential applications, consideration should be given to the differences between English and tonal spoken languages such as Cantonese. It appears that further development in voice recognition technology is required to make voice input widely usable for Chinese-speaking computer users.

Relevance to industry Voice input may seem the ultimate input method, since hands-free data entry can allow tremendous flexibility. However, the results of this study indicate that, in industry, voice input should be used with caution, especially when tonal languages or safety are involved.
