Onset asynchrony in spoken menus

The menu is an important interface component and appears unlikely to be completely superseded by modern search-based approaches. For someone who is unable to attend to a screen visually, however, alternative non-visual menu formats are often problematic. A display is developed in which multiple concurrent words are presented with varying amounts of onset asynchrony. The effects of onset asynchrony and word length on task duration, accuracy and workload are explored. It is found that total task duration is significantly affected by both onset asynchrony and word duration. Error rates are significantly affected by onset asynchrony, word length and their interaction, whilst subjective workload scores are significantly affected only by onset asynchrony. Overall, the results suggest that the best compromise between accuracy, workload and speed may be achieved by presenting shorter or temporally compressed words with a short inter-stimulus interval.
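The timing scheme described above can be illustrated with a minimal sketch (not from the paper; function and parameter names are hypothetical): with a fixed stimulus onset asynchrony, each word starts one asynchrony interval after the previous one, so the total presentation length is the last onset plus the word duration. This makes explicit why both onset asynchrony and word duration drive total task duration.

```python
def onset_times_ms(n_words, onset_asynchrony_ms):
    """Onset time of each concurrent word, in milliseconds,
    assuming a fixed stimulus onset asynchrony between words."""
    return [i * onset_asynchrony_ms for i in range(n_words)]

def presentation_length_ms(n_words, word_duration_ms, onset_asynchrony_ms):
    """Total length of the overlapping presentation: the last word
    begins at (n_words - 1) * asynchrony and lasts word_duration."""
    return (n_words - 1) * onset_asynchrony_ms + word_duration_ms
```

For example, a four-item menu of 600 ms words with a 300 ms onset asynchrony spans 1500 ms, whereas presenting the same words sequentially would take 2400 ms, which is the speed advantage the display trades against accuracy and workload.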
