Crowdsourcing Similarity Judgments for Agreement Analysis in End-User Elicitation Studies

End-user elicitation studies are a popular design method, but their data require substantial time and effort to analyze. In this paper, we present Crowdsensus, a crowd-powered tool that enables researchers to efficiently analyze the results of elicitation studies using subjective human judgment and automatic clustering algorithms. In addition to our own analysis, we asked six expert researchers with experience running and analyzing elicitation studies to analyze an end-user elicitation dataset of 10 functions for operating a web browser, each with 43 voice commands elicited from end-users, for a total of 430 voice commands. We used Crowdsensus to gather similarity judgments of these same 430 commands from 410 online crowd workers. The crowd outperformed the experts, arriving at the same results for seven of eight functions and resolving a function on which the experts had failed to agree. In addition, using Crowdsensus was about four times faster than using experts.
