Speech-based cursor control using grids: modelling performance and comparisons with other solutions

Speech recognition can be a powerful tool for human-computer interaction, especially when the user's hands are unavailable or otherwise engaged. Researchers have confirmed that existing mechanisms for speech-based cursor control are both slow and error-prone. To address this, we evaluated two variations of a novel grid-based cursor controlled via speech recognition. One provides users with nine cursors that can be used to specify the desired location, while the second, more traditional, solution provides a single cursor. Our results confirmed a speed/accuracy trade-off: the nine-cursor variant allowed faster task completion times, while the one-cursor version resulted in lower error rates. Our solutions eliminated the effect of distance and dramatically reduced the importance of target size compared with previous speech-based cursor control mechanisms. The results are explored through a predictive model and comparisons with results from earlier studies.
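The recursive grid selection described above can be sketched as follows. This is a minimal illustration, assuming a 3x3 grid in which each spoken digit (1-9, numbered row-major) selects a cell and the grid then subdivides that cell for the next command; the function names, digit layout, and screen dimensions are hypothetical and not taken from the paper.

```python
def refine(region, digit):
    """Return the 3x3 sub-cell selected by one spoken digit (1-9, row-major)."""
    x, y, w, h = region
    row, col = divmod(digit - 1, 3)  # digit 1 = top-left, digit 9 = bottom-right
    return (x + col * w / 3, y + row * h / 3, w / 3, h / 3)

def cursor_position(screen, digits):
    """Centre of the region reached after a sequence of grid selections."""
    region = screen
    for d in digits:
        region = refine(region, d)
    x, y, w, h = region
    return (x + w / 2, y + h / 2)

# Example: on a 900x900 screen, saying "1" then "5" selects the top-left
# ninth, then its centre cell, leaving the cursor at (150.0, 150.0).
print(cursor_position((0, 0, 900, 900), [1, 5]))
```

Note that each selection shrinks the active region by a factor of three in each dimension regardless of where the target lies, which is one way to see why such a scheme would remove the effect of distance and weaken the effect of target size: size determines only how many refinement steps are needed, not how far the cursor must travel.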
