Detecting Deceptive Opinion Spam Using Human Computation

Websites that encourage consumers to research, rate, and review products online have become an increasingly important factor in purchase decisions. This growing importance has been accompanied by a rise in deceptive opinion spam: fraudulent reviews written to sound authentic and mislead consumers. In this study, we pool deceptive reviews solicited through crowdsourcing with actual reviews obtained from product review websites. We then explore several human- and machine-based assessment methods for spotting deceptive opinion spam in the pooled review set. We find that combining human-based assessment methods with easily obtained statistical information generated from the review text outperforms detection methods that use human assessors alone.
