Voice augmented manipulation: using paralinguistic information to manipulate mobile devices

We propose voice augmented manipulation (VAM), a technique for augmenting user operations in a mobile environment. The technique augments user interactions on mobile devices, such as finger gestures and button presses, with the user's voice. For example, when a user makes a finger gesture on a mobile phone and voices a sound into it, the operation continues until the user stops making the sound or makes another finger gesture. The VAM interface also provides a button-based interface, in which the function bound to a button is augmented by voiced sounds. Two experiments verified the effectiveness of the VAM technique and showed that the number of repeated finger gestures decreased significantly compared with current touch-input techniques, suggesting that VAM is useful for supporting user control in a mobile environment.
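
As a rough illustration of the interaction model described above (a minimal sketch, not the authors' implementation), the following Python snippet shows how a single fling gesture could be sustained for as long as the microphone detects a voiced sound. The names read_mic_level, scroll_by, and the threshold/step constants are hypothetical placeholders standing in for platform audio and scrolling APIs.

```python
import random
import time

VOICE_THRESHOLD = 0.2   # hypothetical normalized energy level that counts as "voicing"
SCROLL_STEP = 40        # hypothetical number of pixels scrolled per tick while voicing continues

def read_mic_level():
    """Hypothetical stand-in for a microphone energy reading in [0, 1].
    A real implementation would sample the device microphone."""
    return random.random()

def scroll_by(pixels):
    """Hypothetical stand-in for the platform scroll call."""
    print(f"scrolling by {pixels}px")

def voice_augmented_fling():
    """After an initial fling gesture, keep scrolling while the user voices a sound.
    The operation stops as soon as the microphone level drops below the threshold
    (a second finger gesture could also break the loop in a real implementation)."""
    scroll_by(SCROLL_STEP)                 # the initial finger gesture triggers one scroll step
    while read_mic_level() >= VOICE_THRESHOLD:
        scroll_by(SCROLL_STEP)             # sustained voicing keeps the operation going
        time.sleep(0.05)                   # poll the microphone roughly 20 times per second

if __name__ == "__main__":
    voice_augmented_fling()
```

In this sketch the voice carries only paralinguistic information (whether the user is voicing at all), so no speech recognition is required; the same pattern could augment a button press instead of a fling.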
