Deceptive Opinion Spam Detection Using Neural Network

Deceptive opinion spam detection has attracted significant attention from both the business and research communities. Existing approaches rely on manually engineered discrete features, which can capture linguistic and psychological cues but fail to encode the semantic meaning of a document from the discourse perspective, limiting detection performance. In this paper, we empirically explore a neural network model that learns document-level representations for detecting deceptive opinion spam. In particular, given a document, the model learns sentence representations with a convolutional neural network, which are then combined by a gated recurrent neural network with an attention mechanism to model discourse information and yield a document vector. Finally, the document representation is used directly as features to identify deceptive opinion spam. Experimental results on three domains (Hotel, Restaurant, and Doctor) show that our proposed method outperforms state-of-the-art methods.
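
The following is a minimal sketch of the hierarchical architecture described above: a CNN encodes each sentence, a gated recurrent network with attention pools the sentence vectors into a document vector, and a linear layer predicts deceptive vs. truthful. It is written in PyTorch for illustration only; the layer sizes, filter width, bidirectional GRU, and single-width convolution are assumptions, not the paper's reported settings.

```python
# Hypothetical sketch of the CNN -> gated RNN + attention document model.
# Hyperparameters (emb_dim, sent_dim, doc_dim, kernel_size) are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DocumentSpamDetector(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, sent_dim=100, doc_dim=100, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # CNN sentence encoder (one filter width here for brevity).
        self.conv = nn.Conv1d(emb_dim, sent_dim, kernel_size=3, padding=1)
        # Gated recurrent network over sentence vectors to capture discourse structure.
        self.gru = nn.GRU(sent_dim, doc_dim, batch_first=True, bidirectional=True)
        # Attention produces one score per sentence.
        self.attn = nn.Linear(2 * doc_dim, 1)
        self.classifier = nn.Linear(2 * doc_dim, num_classes)

    def forward(self, docs):
        # docs: (batch, num_sentences, num_words) tensor of word ids.
        b, s, w = docs.size()
        words = self.embed(docs.view(b * s, w))             # (b*s, w, emb_dim)
        feats = F.relu(self.conv(words.transpose(1, 2)))    # (b*s, sent_dim, w)
        sents = feats.max(dim=2).values.view(b, s, -1)      # max-pool -> sentence vectors
        hidden, _ = self.gru(sents)                         # (b, s, 2*doc_dim)
        scores = self.attn(hidden).squeeze(-1)              # (b, s)
        weights = torch.softmax(scores, dim=1)              # attention weights over sentences
        doc_vec = (weights.unsqueeze(-1) * hidden).sum(1)   # attention-weighted document vector
        return self.classifier(doc_vec)                     # logits over {truthful, deceptive}
```

The document vector produced by the attention-weighted sum plays the role of the "document representation used directly as features" mentioned in the abstract; training with a standard cross-entropy loss over the two classes would be one reasonable choice.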
