Exploring the use of crowdsourcing to support empirical studies in software engineering

The power and generality of the findings obtained through empirical studies are bounded by the number and type of participating subjects. In software engineering, obtaining a large number of adequate subjects to evaluate a technique or tool is often a major challenge. In this work we explore the use of crowdsourcing as a mechanism to address that challenge by assisting in subject recruitment. More specifically, we show how we adapted a study to run on an infrastructure that not only makes it possible to reach a large base of users but also provides capabilities to manage those users as the study is being conducted. We discuss the lessons we learned through this experience, which illustrate the potential and tradeoffs of crowdsourcing software engineering studies.
