Spam Review Detection with Graph Convolutional Networks

Reviews on online shopping websites affect customers' buying decisions and, at the same time, attract many spammers who aim to mislead buyers. Xianyu, the largest second-hand goods app in China, suffers from spam reviews. The anti-spam system of Xianyu faces two major challenges: the scale of the data and the adversarial actions taken by spammers. In this paper, we present our technical solutions to these challenges. We propose a large-scale anti-spam method based on graph convolutional networks (GCN) for detecting spam advertisements at Xianyu, named the GCN-based Anti-Spam (GAS) model. In this model, a heterogeneous graph and a homogeneous graph are integrated to capture the local context and the global context of a comment. Offline experiments show that the proposed method outperforms our baseline model, which uses the information of reviews together with the features of users and of the items being reviewed. Furthermore, we deploy our system to process million-scale data daily at Xianyu. The online performance also demonstrates the effectiveness of the proposed method.
