The online laboratory: conducting experiments in a real labor market

Online labor markets have great potential as platforms for conducting experiments. They provide immediate access to a large and diverse subject pool, and allow researchers to control the experimental context. Online experiments, we show, can be just as valid—both internally and externally—as laboratory and field experiments, while often requiring far less money and time to design and conduct. To demonstrate their value, we use an online labor market to replicate three classic experiments. The first finds quantitative agreement between levels of cooperation in a prisoner’s dilemma played online and in the physical laboratory. The second shows—consistent with behavior in the traditional laboratory—that online subjects respond to priming by altering their choices. The third demonstrates that when an identical decision is framed differently, individuals reverse their choice, thus replicating a famed Tversky-Kahneman result. Then we conduct a field experiment showing that workers have upward-sloping labor supply curves. Finally, we analyze the challenges to online experiments, proposing methods to cope with the unique threats to validity in an online setting, and examining the conceptual issues surrounding the external validity of online results. We conclude by presenting our views on the potential role that online experiments can play within the social sciences, and then recommend software development priorities and best practices.
