Detecting Anomalous Reviewers and Estimating Summaries from Early Reviews Considering Heterogeneity

Early reviews, posted on online review sites shortly after products enter the market, are useful for estimating the long-term evaluations of those products and for supporting purchase decisions. However, because the number of early reviews is usually small, such reviews are easily influenced by anomalous reviewers, including malicious and fraudulent ones. It is therefore challenging to detect anomalous reviewers from early reviews and to estimate long-term evaluations while reducing their influence. We find that two characteristics of heterogeneity on actual review sites such as Amazon.com make it difficult to detect anomalous reviewers from early reviews. We propose ideas for handling this heterogeneity, along with a methodology that simultaneously computes each reviewer's degree of anomaly and estimates long-term evaluations. Experimental evaluations with actual reviews from Amazon.com show that our proposed method achieves the best performance in 19 of 20 tests compared with state-of-the-art methods.

key words: early reviews, heterogeneity, data mining, bipartite graph
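The abstract describes computing reviewers' degree of anomaly and product evaluations simultaneously on a bipartite reviewer-product graph, but does not give the paper's update rules. Below is a minimal, hypothetical sketch of the general mutually reinforcing idea such methods build on: a product's estimated evaluation is a trust-weighted mean of its ratings, and a reviewer's trust decays with deviation from those estimates. All function and variable names here are illustrative assumptions, not the paper's actual method, and the sketch does not model the heterogeneity the paper addresses.

```python
def mutual_reinforce(ratings, n_iter=20):
    """Mutually reinforcing scores on a reviewer-product bipartite graph.

    ratings: list of (reviewer, product, score) tuples, scores on a 1-5 scale.
    Returns (trust, estimate): per-reviewer trust in (0, 1] (lower = more
    anomalous) and per-product estimated evaluation.
    """
    reviewers = {r for r, _, _ in ratings}
    products = {p for _, p, _ in ratings}
    trust = {r: 1.0 for r in reviewers}      # start fully trusted
    estimate = {p: 3.0 for p in products}    # neutral prior

    for _ in range(n_iter):
        # Product step: trust-weighted mean of the ratings it received.
        for p in products:
            num = sum(trust[r] * s for r, q, s in ratings if q == p)
            den = sum(trust[r] for r, q, _ in ratings if q == p)
            estimate[p] = num / den if den else 3.0
        # Reviewer step: trust shrinks with mean deviation from estimates.
        for r in reviewers:
            devs = [abs(s - estimate[p]) for q, p, s in ratings if q == r]
            trust[r] = 1.0 / (1.0 + sum(devs) / len(devs))
    return trust, estimate
```

With a few concordant reviewers and one outlier on the same product, the outlier's trust drops below the others' and the product estimate moves toward the majority rating, which illustrates why a single anomalous reviewer matters more when the pool of early reviews is small.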
