Revealing, characterizing, and detecting crowdsourcing spammers: A case study in community Q&A

Crowdsourcing services have emerged and become popular on the Internet in recent years. However, evidence shows that crowdsourcing can be maliciously manipulated. In this paper, we focus on the “dark side” of crowdsourcing services. More specifically, we investigate the spam campaigns that originate and are orchestrated on a large China-based crowdsourcing website, ZhuBaJie.com, and trace the crowd workers to their spamming behaviors on Baidu Zhidao, the largest community-based question answering (QA) site in China. By linking the spam campaigns, workers, spammer accounts, and spamming behaviors together, we are able to reveal the entire ecosystem that underlies the crowdsourcing spam attacks. We present a comprehensive analysis of the ecosystem from multiple perspectives, including the scale and scope of the spam attacks, the Sybil accounts and colluding strategies employed by the spammers, the workers' efforts and monetary rewards, and the quality control performed by the spam campaigners. We also analyze the behavioral discrepancies between the spammer accounts and the legitimate users in community QA, and present methodologies for detecting the spammers based on our understanding of the crowdsourcing spam ecosystem.
