Conversational gestures for direct manipulation on the audio desktop

We describe the speech-enabling approach to building auditory interfaces that treat speech as a first-class modality. The process of designing effective auditory interfaces is decomposed into identifying the atomic actions that make up the user interaction and the conversational gestures that enable these actions. The auditory interface is then synthesized by mapping these conversational gestures to appropriate primitives in the auditory environment. We illustrate this process by developing an auditory interface to the visually intensive task of playing Tetris. Playing Tetris is a fun activity that places many of the same demands on the user as day-to-day activities on the electronic desktop. Speech-enabling Tetris thus not only provides a fun way to exercise one's geometric reasoning abilities; it also provides useful lessons in speech-enabling commonplace computing tasks.
