Enactment of Ensemble Learning for Review Spam Detection on Selected Features

This study evaluates the performance of ensemble learning for review spam detection using selected features extracted from real and semi-real-life datasets. We report several performance metrics, including Precision, Recall, F-Measure, and the Receiver Operating Characteristic (ROC). Our proposed ensemble learning module (ELM), combined with the Chi-squared feature selection technique, outperformed all others, achieving a Precision of 0.851.
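To make the pipeline named in the abstract concrete, the following is a minimal sketch of its three ingredients: Chi-squared feature scoring, an ensemble decision, and the Precision, Recall, and F-Measure metrics. It is not the authors' ELM. The toy reviews, the function names, the bag-of-words term-presence features, and the use of simple majority voting as the ensemble rule are all assumptions made for illustration; the abstract does not specify the base classifiers or the combination rule.

```python
# Illustrative sketch only: toy data and all names below are hypothetical,
# and majority voting is an assumed stand-in for the paper's ensemble module.

def chi2_score(docs, labels, term):
    """Chi-squared statistic of the 2x2 table (term present?) x (spam?)."""
    n = len(docs)
    a = sum(1 for d, y in zip(docs, labels) if term in d and y == 1)      # term, spam
    b = sum(1 for d, y in zip(docs, labels) if term in d and y == 0)      # term, genuine
    c = sum(1 for d, y in zip(docs, labels) if term not in d and y == 1)  # no term, spam
    e = sum(1 for d, y in zip(docs, labels) if term not in d and y == 0)  # no term, genuine
    denom = (a + b) * (c + e) * (a + c) * (b + e)
    if denom == 0:
        return 0.0  # term (or class) is constant, so it carries no signal
    return n * (a * e - b * c) ** 2 / denom


def select_top_k(docs, labels, k):
    """Return the k terms with the highest chi-squared score (ties: alphabetical)."""
    vocabulary = set().union(*docs)
    ranked = sorted(vocabulary, key=lambda t: (-chi2_score(docs, labels, t), t))
    return ranked[:k]


def majority_vote(predictions):
    """Combine binary predictions from several classifiers, one list per classifier."""
    return [1 if 2 * sum(column) > len(column) else 0 for column in zip(*predictions)]


def precision_recall_f1(y_true, y_pred):
    """Precision, Recall, and F-Measure for the positive (spam) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical toy corpus: each review is a set of tokens; label 1 = spam.
docs = [
    {"best", "hotel", "ever", "amazing"},
    {"amazing", "stay", "best", "deal"},
    {"room", "was", "clean", "quiet"},
    {"quiet", "room", "slow", "checkin"},
]
labels = [1, 1, 0, 0]

# Hypothetical outputs of three base classifiers on the four reviews.
votes = [[1, 1, 0, 0], [1, 0, 0, 1], [1, 1, 1, 0]]

selected = select_top_k(docs, labels, 2)
ensemble = majority_vote(votes)
metrics = precision_recall_f1(labels, ensemble)
```

In this sketch the feature selector keeps only the terms most strongly associated with the spam label, and the ensemble corrects the individual mistakes of the second and third hypothetical classifiers. A real evaluation would train the base classifiers on the selected features and compute the metrics on held-out data.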
