Leveraging Clickstream Trajectories to Reveal Low-Quality Workers in Crowdsourced Forecasting Platforms

Crowdwork often entails tackling cognitively demanding and time-consuming tasks. Crowdsourcing is used for complex annotation tasks, from medical imaging to geospatial data, and such data powers sensitive applications such as health diagnostics and autonomous driving. However, the existence and prevalence of underperforming crowdworkers are well recognized and can threaten the validity of crowdsourcing. In this study, we propose a computational framework that identifies clusters of underperforming workers from their clickstream trajectories, focusing on crowdsourced geopolitical forecasting. The framework reveals different types of underperformers, such as workers whose forecast accuracy is far from the consensus of the crowd, workers who provide low-quality explanations for their forecasts, and workers who simply copy-paste their forecasts from other users. Our study suggests that clickstream clustering and analysis are fundamental tools for diagnosing the performance of crowdworkers on platforms that leverage the wisdom of crowds.
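To make the idea of clustering workers by clickstream trajectories concrete, the sketch below is a minimal, hypothetical pipeline: it encodes each worker's clickstream as counts of consecutive action pairs (bigrams) and then groups workers by hierarchical clustering over cosine distances. The action labels, feature choice, and clustering settings are illustrative assumptions, not the framework described in the paper.

```python
# Minimal sketch (assumptions, not the paper's implementation): each worker's
# clickstream is a sequence of action labels; we represent workers by bigram
# counts over those labels, then cluster them by cosine distance.
from collections import Counter

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def bigram_profile(clickstream):
    """Count consecutive action pairs (bigrams) in one worker's clickstream."""
    return Counter(zip(clickstream, clickstream[1:]))


def profile_matrix(clickstreams):
    """Stack all workers' bigram counts into a dense matrix over a shared vocabulary."""
    vocab = sorted({bg for cs in clickstreams for bg in bigram_profile(cs)})
    index = {bg: i for i, bg in enumerate(vocab)}
    X = np.zeros((len(clickstreams), len(vocab)))
    for row, cs in enumerate(clickstreams):
        for bg, count in bigram_profile(cs).items():
            X[row, index[bg]] = count
    return X


# Hypothetical clickstreams: one list of action labels per worker.
workers = [
    ["login", "view_question", "read_comments", "submit_forecast", "write_rationale"],
    ["login", "view_question", "read_comments", "submit_forecast", "write_rationale"],
    ["login", "view_question", "copy_forecast", "submit_forecast"],
    ["login", "copy_forecast", "submit_forecast", "logout"],
]

X = profile_matrix(workers)
# Agglomerative clustering on cosine distances between behavioral profiles.
distances = pdist(X, metric="cosine")
tree = linkage(distances, method="average")
labels = fcluster(tree, t=2, criterion="maxclust")
print(labels)  # e.g., separates the engaged workers from the copy-paste-like ones
```

In practice, the clusters produced by such a pipeline would still need to be inspected and linked to outcome measures (forecast accuracy, rationale quality) before labeling any group as underperforming.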
