The popularity of e-commerce has made the web an excellent source of customer reviews and opinions about purchased products. The number of reviews a product receives is growing rapidly and can reach hundreds or thousands. Opinion mining from product reviews, forum posts, and blogs is an important research topic with many applications. However, existing research focuses mainly on classifying and summarizing these online opinions, and the trustworthiness of the opinions has largely been neglected. Although web spam and email spam have been investigated extensively, there is no reported study on assessing the trustworthiness of reviews, which is crucial for all opinion-based applications. In this paper, we attempt to detect whether a review is spam or non-spam, so that customers can rely on trusted reviews when making buying decisions. Duplicate and near-duplicate reviews are classified as spam; partially related and unique reviews are classified as non-spam. We propose a novel and effective technique, a conceptual-level similarity measure, that detects spam reviews based on the product features commented on in the reviews. Experimental results demonstrate the effectiveness of the proposed technique in detecting spam and non-spam reviews. Identifying and eliminating duplicate and near-duplicate spam reviews makes web-based customer review spam detection more efficient and yields a summary of trusted reviews on which customers can base buying decisions.
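The four review categories described above can be illustrated with a minimal sketch. It assumes each review has already been reduced to the set of product features it comments on, and it substitutes plain Jaccard overlap for the conceptual-level similarity measure, whose definition is not given in this abstract. The function names and the threshold values are hypothetical and chosen only for illustration.

```python
from typing import Set

# Hypothetical cut-offs: the abstract gives no numeric thresholds.
DUPLICATE_AT = 1.0
NEAR_DUPLICATE_AT = 0.7
PARTIAL_AT = 0.3


def feature_similarity(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two feature sets (a stand-in, not the paper's measure)."""
    if not a and not b:
        return 0.0  # two reviews with no extracted features share nothing
    return len(a & b) / len(a | b)


def categorize(a: Set[str], b: Set[str]) -> str:
    """Map the similarity score of a review pair to one of the four categories."""
    score = feature_similarity(a, b)
    if score >= DUPLICATE_AT:
        return "duplicate"
    if score >= NEAR_DUPLICATE_AT:
        return "near-duplicate"
    if score >= PARTIAL_AT:
        return "partially related"
    return "unique"


def is_spam(a: Set[str], b: Set[str]) -> bool:
    """Duplicate and near-duplicate pairs count as spam; the rest do not."""
    return categorize(a, b) in {"duplicate", "near-duplicate"}


if __name__ == "__main__":
    review_a = {"battery", "screen", "camera", "price"}
    review_b = {"battery", "screen", "camera", "price", "zoom"}
    print(categorize(review_a, review_b), is_spam(review_a, review_b))
```

A set-based overlap only captures exact feature matches; a conceptual-level measure would presumably also relate features that are worded differently, which this sketch does not attempt.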