Quantifier Guided Aggregation for the Veracity Assessment of Online Reviews

The Social Web is characterized by a massive diffusion of unfiltered content, generated directly by users across a variety of social media platforms. In this context, a challenging issue is assessing the veracity of the information published on online review sites. To address this issue, a common practice in the literature is to select and analyze veracity features associated with users and their reviews, mostly by applying machine learning techniques, in order to classify reviews as genuine or deceptive. In this paper, we do not focus on feature selection or user behavior analysis; instead, we concentrate on the process of aggregating the individual veracity features. In most approaches based on machine learning techniques, the contribution of each feature to the classification is not measurable by the user. For this reason, we propose a multicriteria decision making approach based on both the assessment of multiple criteria and the use of aggregation operators, with the aim of obtaining a veracity score for each review. Based on this score, fake reviews can be detected. The proposed model is evaluated on a Yelp data set by applying different aggregation schemes, and it is compared with well-known supervised machine learning techniques.
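As an illustration of the kind of aggregation the title refers to, the following minimal sketch computes a veracity score with an ordered weighted averaging (OWA) operator whose weights are derived from a linguistic quantifier. It is not the model evaluated in the paper: the feature scores, the quantifier Q(r) = r^2 (standing in for something like "most"), and the decision threshold are all hypothetical values chosen for the example.

```python
from typing import Callable, List, Sequence


def quantifier_weights(n: int, quantifier: Callable[[float], float]) -> List[float]:
    """Derive OWA weights from a monotone quantifier Q with Q(0)=0 and Q(1)=1.

    Uses the standard construction w_i = Q(i/n) - Q((i-1)/n).
    """
    return [quantifier(i / n) - quantifier((i - 1) / n) for i in range(1, n + 1)]


def owa(scores: Sequence[float], quantifier: Callable[[float], float]) -> float:
    """Aggregate per-feature scores in [0, 1] into a single score."""
    # OWA weights apply to positions in the descending ordering,
    # not to specific features.
    ordered = sorted(scores, reverse=True)
    weights = quantifier_weights(len(ordered), quantifier)
    return sum(w * b for w, b in zip(weights, ordered))


def most(r: float) -> float:
    """Hypothetical quantifier; an assumption, not taken from the paper."""
    return r ** 2


# Hypothetical satisfaction degrees of four veracity features for one review.
feature_scores = [0.9, 0.4, 0.7, 0.2]
veracity = owa(feature_scores, most)

# Hypothetical threshold separating genuine from deceptive reviews.
THRESHOLD = 0.5
is_genuine = veracity >= THRESHOLD
```

With the identity quantifier Q(r) = r the weights are uniform and the operator reduces to the arithmetic mean; a convex quantifier such as the one above shifts weight toward the lowest scores, so a review needs most of its features satisfied to obtain a high score.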
