Vocal Shortcuts for Creative Experts

Vocal shortcuts, short spoken phrases used to control an interface, have the potential to reduce the cognitive and physical costs of interaction. They may benefit expert users of creative applications (e.g., designers, illustrators) by helping them maintain creative focus. To gather use cases and design guidelines for speech interaction and to inform the design of vocal shortcuts, we interviewed ten creative experts. Based on our findings, we built VoiceCuts, a prototype implementation of vocal shortcuts within an existing creative application. In contrast to other speech interfaces, VoiceCuts targets experts' unique needs by handling short and partial commands and by leveraging the document model and application context to disambiguate user utterances. We report on the viability and limitations of our approach based on feedback from creative experts.
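The kind of context-based disambiguation described above can be illustrated with a minimal sketch. This is a hypothetical example, not the VoiceCuts implementation: the names (`AppContext`, `interpret`, `active_tool`, `selection`) and the resolution rules are assumptions made for illustration. The idea is that a partial utterance (a bare action or a bare parameter value) is completed from application state such as the current selection or the active tool.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch of resolving short/partial vocal commands against
# application context; names and rules are illustrative assumptions.

@dataclass
class AppContext:
    """Snapshot of application state used to resolve partial utterances."""
    active_tool: str = "brush"
    selection: Optional[str] = None  # e.g. name of the selected layer

def interpret(utterance: str, ctx: AppContext) -> str:
    """Map a short or partial spoken phrase to a full command string."""
    tokens = utterance.lower().split()
    actions = {"hide", "show", "delete", "duplicate"}
    if tokens[0] in actions:
        # A full command names an action and a target ("hide background");
        # a bare action borrows its target from the current selection.
        target = " ".join(tokens[1:]) if len(tokens) > 1 else ctx.selection
        if target is None:
            return "error: no target selected"
        return f"{tokens[0]}({target!r})"
    # A bare parameter value ("blue", "50") is routed to the active tool.
    return f"set_{ctx.active_tool}_parameter({utterance!r})"

ctx = AppContext(active_tool="brush", selection="background")
print(interpret("hide", ctx))   # target filled in from the selection
print(interpret("blue", ctx))   # value applied to the active tool
```

A real system would of course draw on a richer document model (layers, objects, recent actions) and handle recognition uncertainty, but the sketch shows why short utterances need not be ambiguous when the application state is consulted.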
