Handling Cold-Start Problem in Review Spam Detection by Jointly Embedding Texts and Behaviors

Solving the cold-start problem in review spam detection is an urgent and significant task: it helps online review websites limit the damage done by spammers in a timely manner, yet it has not been investigated in previous work. This paper proposes a novel neural network model that detects review spam in the cold-start setting by learning to represent a new reviewer's review with jointly embedded textual and behavioral information. Experimental results show that the proposed model performs effectively and adapts well across domains. It is also applicable to a large-scale dataset in an unsupervised way.
