Learning the Prior in Minimal Peer Prediction

Many crowdsourcing applications rely on the truthful elicitation of information from workers; e.g., voting on the quality of an image label, or whether a website is inappropriate for an advertiser. Peer prediction provides a theoretical mechanism for eliciting truthful reports. However, its application depends on knowledge of a full probabilistic model: both a distribution on votes and a posterior for each possible single vote received. In earlier work, Witkowski and Parkes [2012b] relax this requirement at the cost of “nonminimality,” i.e., users would need to both vote and report a belief about the votes of others. Other methods insist on minimality but still require knowledge of the distribution on votes, i.e., the signal prior but not the posterior [Jurca and Faltings 2008, 2011; Witkowski and Parkes 2012a]. In this paper, we develop the theoretical foundation for learning the signal prior in combination with these minimal peer-prediction methods. To score an agent, our mechanism “shadows” [Witkowski and Parkes 2012a] against the empirical frequency of reported signals, delaying payments until the empirical frequency is accurate enough. We provide a bound on the number of samples required for the resulting mechanism to provide strict incentives for truthful reporting.
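
As a rough illustration of the idea described above, the following Python sketch shadows an empirical signal frequency (a learned prior) by a fixed offset in the direction of the agent's report, scores the shadowed belief against a peer's report with a quadratic scoring rule, and withholds payment until enough reports have accumulated. The offset delta, the quadratic rule, and the sample threshold n_min are illustrative assumptions, not the paper's calibrated parameters or bounds.

```python
def quadratic_score(belief_high, peer_report):
    """Binary quadratic (Brier-style) scoring rule, normalized to [0, 1]."""
    outcome = 1.0 if peer_report == 1 else 0.0
    return 1.0 - (belief_high - outcome) ** 2

def shadow_payment(report, peer_report, reports_so_far, delta=0.1, n_min=50):
    """Pay an agent by shadowing the empirical frequency of reported signals.

    report, peer_report: 0/1 signals from the agent and a reference peer.
    reports_so_far: list of 0/1 reports used to estimate the signal prior.
    Returns None (payment delayed) while fewer than n_min reports are available,
    i.e., until the empirical frequency is deemed accurate enough.
    """
    if len(reports_so_far) < n_min:
        return None  # delay payment until the empirical prior is reliable

    empirical_prior = sum(reports_so_far) / len(reports_so_far)

    # Shadowing: shift the empirical prior toward the agent's own report.
    if report == 1:
        shadow_belief = min(empirical_prior + delta, 1.0)
    else:
        shadow_belief = max(empirical_prior - delta, 0.0)

    # Score the shadowed belief against the peer's report.
    return quadratic_score(shadow_belief, peer_report)

# Example usage: 60 earlier reports determine the empirical prior.
history = [1, 0, 1, 1, 0, 1] * 10
print(shadow_payment(report=1, peer_report=1, reports_so_far=history))
```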