Opinion Fraud Detection in Online Reviews by Network Effects

User-generated online reviews can play a significant role in the success of retail products, hotels, restaurants, etc. However, review systems are often targeted by opinion spammers who seek to distort the perceived quality of a product by creating fraudulent reviews. We propose a fast and effective framework, FRAUDEAGLE, for spotting fraudsters and fake reviews in online review datasets. Our method has several advantages: (1) it exploits the network effect among reviewers and products, unlike the vast majority of existing methods, which focus on review text or behavioral analysis; (2) it consists of two complementary steps: scoring users and reviews for fraud detection, and grouping for visualization and sensemaking; (3) it operates in a completely unsupervised fashion, requiring no labeled data, while still incorporating side information if available; and (4) it is scalable to large datasets, as its run time grows linearly with network size. We demonstrate the effectiveness of our framework on synthetic and real datasets, where FRAUDEAGLE successfully reveals fraud-bots in a large online app review database.
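
To make the idea of a "network effect" concrete, the following is a minimal illustrative sketch in Python. It is not the FRAUDEAGLE algorithm, whose details the abstract does not give. It is a simplified iterative heuristic under assumed conventions: each review is a signed edge (+1 for positive, -1 for negative) between a user and a product, and the function name `propagate_scores`, the score definitions, and the toy data are all hypothetical. Each pass touches every edge a constant number of times, so the cost per iteration grows linearly with the number of reviews.

```python
from collections import defaultdict


def propagate_scores(reviews, n_iters=10):
    """Illustrative heuristic (an assumption, not the paper's method).

    reviews: list of (user, product, sign) tuples, sign in {+1, -1}.
    Returns (honesty, quality): honesty[user] in [0, 1],
    quality[product] in [-1, 1].
    """
    by_user = defaultdict(list)
    by_product = defaultdict(list)
    for user, product, sign in reviews:
        by_user[user].append((product, sign))
        by_product[product].append((user, sign))

    # Unsupervised start: every user is trusted, every product is neutral.
    honesty = {u: 1.0 for u in by_user}
    quality = {p: 0.0 for p in by_product}

    for _iteration in range(n_iters):
        # Product quality: honesty-weighted average of its review signs.
        for p, edges in by_product.items():
            total = sum(honesty[u] for u, _sign in edges)
            signed = sum(honesty[u] * s for u, s in edges)
            quality[p] = signed / total if total > 0 else 0.0
        # User honesty: mean agreement between the user's reviews and
        # the current quality estimates of the reviewed products.
        for u, edges in by_user.items():
            agreement = sum((1 + s * quality[p]) / 2 for p, s in edges)
            honesty[u] = agreement / len(edges)
    return honesty, quality


# Toy data: users a, b, c praise P1 and pan P2; user x does the opposite.
reviews = [
    ("a", "P1", +1), ("a", "P2", -1),
    ("b", "P1", +1), ("b", "P2", -1),
    ("c", "P1", +1), ("c", "P2", -1),
    ("x", "P1", -1), ("x", "P2", +1),
]
honesty, quality = propagate_scores(reviews)
```

In this toy network, user x consistently contradicts the majority, so its honesty score falls while the scores of a, b, and c rise. The scores come from the graph structure alone, without review text or labeled examples.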
