Using Prosodic Features of Speech and Audio Localization in Graphical User Interfaces

We describe several approaches to using prosodic features of speech and audio localization to control interactive applications. These cues can drive continuous parameter control and help disambiguate speech recognition results. We discuss how characteristics of a spoken sentence can be exploited in the user interface, for example by considering the speed at which it was spoken and the presence of extraneous utterances. We also show how coarse audio localization can provide low-fidelity gesture tracking by inferring the speaker's head position.
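As a rough illustration of the audio-localization idea, coarse head position can be inferred from the time difference of arrival (TDOA) of the voice at two microphones, estimated by cross-correlating the two channels. The sketch below is a minimal, hypothetical implementation of this general technique (the function name, microphone spacing, and synthetic signals are assumptions, not details from the paper):

```python
import numpy as np

def estimate_azimuth(left, right, fs, mic_distance, c=343.0):
    """Estimate a coarse source azimuth (degrees) from the TDOA between
    two microphones, found via cross-correlation of the two channels.
    Hypothetical sketch; positive angles point toward the right mic."""
    corr = np.correlate(left, right, mode="full")
    # lag > 0 (in samples) means the sound reached the right mic first
    lag = np.argmax(corr) - (len(right) - 1)
    tdoa = lag / fs
    # clamp to the physically possible range before taking arcsin
    s = np.clip(tdoa * c / mic_distance, -1.0, 1.0)
    return float(np.degrees(np.arcsin(s)))

# Synthetic demo: an impulse that reaches the left mic 5 samples earlier,
# i.e. a source off to the speaker's left.
fs = 16000
mic_distance = 0.2  # metres between the two microphones (assumed)
n = 1024
left = np.zeros(n)
right = np.zeros(n)
left[100] = 1.0
right[105] = 1.0  # same click, delayed 5 samples at the right mic
az = estimate_azimuth(left, right, fs, mic_distance)
```

With these numbers the estimated azimuth comes out negative (toward the left microphone), which is the kind of coarse left/center/right cue an interface could map to head position.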