Saving Money While Polling with InterPoll Using Power Analysis

Crowd-sourcing is increasingly used to provide responses to polls and surveys on a large scale. Companies such as SurveyMonkey and Instant.ly are attempting to make crowd-sourced surveys commonplace by letting survey makers pose questions through an easy-to-use UI and retrieve results with relatively low latency from dedicated crowds at their disposal. In this paper we argue that the ease with which polls can be created conceals an inherent difficulty: the survey maker does not know how many workers to hire for their survey. Asking too few may lead to sample sizes that "do not look impressive enough"; asking too many clearly involves spending extra money, which can quickly become costly. Existing crowd-sourcing platforms do not provide help with this, nor, one can argue, do they have any incentive to do so. We present a systematic approach to determining how many samples (i.e., workers) are required to achieve a given level of statistical significance by showing how to automatically perform power analysis on the questions of interest. Using a range of queries, we demonstrate that power analysis can save significant amounts of money and time by showing that, frequently, only a handful of results is required to arrive at a decision. We have implemented our approach within InterPoll, a programmable, developer-driven polling system that uses a generic crowd (Mechanical Turk) as a back-end. Power analysis is performed automatically, given both the structure of the query and the data being polled from the crowd. In all of our studies we were able to obtain statistically significant answers for under $30, with most costing less than $10. Our approach saves both time and money for the survey maker.
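
To make the power analysis concrete, consider the question it answers: how many responses are needed before a test can be expected to reach significance? The sketch below is a minimal illustration of a sample-size calculation for a single proportion using the standard normal approximation; the function name, its defaults, and its use here are our own assumptions for illustration, not InterPoll's actual API.

    from math import ceil
    from scipy.stats import norm

    def samples_for_proportion(p0: float, p1: float,
                               alpha: float = 0.05, power: float = 0.8) -> int:
        """Sample size for a one-sample, two-sided z-test to distinguish a
        true proportion p1 from a null proportion p0 (normal approximation).
        Illustrative only; InterPoll derives such bounds automatically."""
        z_alpha = norm.ppf(1 - alpha / 2)   # critical value, two-sided test
        z_beta = norm.ppf(power)            # quantile for the desired power
        n = ((z_alpha * (p0 * (1 - p0)) ** 0.5
              + z_beta * (p1 * (1 - p1)) ** 0.5) ** 2) / (p1 - p0) ** 2
        return ceil(n)

    # Distinguishing a 60/40 opinion split from an even 50/50 null:
    print(samples_for_proportion(0.5, 0.6))  # 194 workers

For instance, distinguishing a 60/40 split from an even 50/50 null at the conventional 5% significance level and 80% power requires about 194 responses, while larger observed effects require far fewer; this is the sense in which stopping as soon as the desired power is reached saves both money and time.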
