Avicaching: A Two Stage Game for Bias Reduction in Citizen Science

Citizen science projects have been very successful at collecting rich datasets for different applications. However, the data collected by citizen scientists are often biased, aligned more with the citizens' preferences than with scientific objectives. We introduce a novel two-stage game for reducing data bias in citizen science, in which the game organizer, a citizen-science program, incentivizes the agents, the citizen scientists, to visit under-sampled areas. We provide a novel way of encoding this two-stage game as a single optimization problem by folding (an approximation of) the agents' problems into the organizer's problem. We present several new algorithms to solve this optimization problem, as well as a new structural SVM approach to learn the parameters that capture the agents' behaviors under different incentive schemes. We apply our methodology to eBird, a well-established citizen-science program for collecting bird observations, as a game called Avicaching. We deployed Avicaching in two New York counties in March 2015; the response from the birding community was strong, surpassing the expectations of the eBird organizers and bird-conservation experts. The field results show that the Avicaching incentives are remarkably effective at encouraging bird watchers to explore under-sampled areas and hence at alleviating eBird's data-bias problem.
