Characterizing and detecting malicious crowdsourcing

Popular Internet services in recent years have shown that remarkable things can be achieved by harnessing the power of the masses. However, crowdsourcing systems also pose a real challenge to existing security mechanisms deployed to protect Internet services, particularly tools such as CAPTCHAs that identify malicious activity by detecting the actions of automated programs. In this work, we leverage access to two large crowdturfing sites to gather a large corpus of ground-truth data generated by crowdturfing campaigns. We compare and contrast this data with "organic" content generated by normal users to identify unique characteristics and potential signatures for use in real-time detectors. This poster describes our first steps, which focus on crowdturfing campaigns targeting the Sina Weibo microblogging system. We describe our methodology, our data (over 290K campaigns, 34K worker accounts, 61 million tweets, ...), and some initial results.
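To make the idea of signature-based detection concrete, the sketch below trains a simple classifier that separates crowdturfing worker accounts from organic accounts using per-account behavioral features. The feature set (posting rate, follower/followee ratio, fraction of campaign-related tweets), the synthetic placeholder data, and the choice of logistic regression are illustrative assumptions for this sketch, not the detectors or signatures reported in this work.

```python
# Hypothetical sketch: a feature-based detector contrasting crowdturfing worker
# accounts with organic accounts. Features, data, and classifier are assumptions
# made for illustration, not the method described in the poster.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n = 1000

# Placeholder feature vectors standing in for labeled accounts:
# [tweets per day, follower/followee ratio, fraction of campaign-related tweets]
workers = np.column_stack([
    rng.gamma(2.0, 10.0, n),      # bursty, high-volume posting
    rng.uniform(0.0, 0.5, n),     # few followers relative to followees
    rng.uniform(0.5, 1.0, n),     # most tweets tied to paid campaigns
])
organic = np.column_stack([
    rng.gamma(2.0, 2.0, n),
    rng.uniform(0.5, 3.0, n),
    rng.uniform(0.0, 0.1, n),
])

X = np.vstack([workers, organic])
y = np.concatenate([np.ones(n), np.zeros(n)])  # 1 = crowdturfing worker

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```

In practice such a detector would be trained on features extracted from the ground-truth crowdturfing corpus and the organic Weibo accounts, and the most discriminative features would serve as the real-time signatures the abstract alludes to.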
