Follow spam detection based on cascaded social information

In the last decade we have witnessed the explosive growth of online social networking services (SNSs) such as Facebook, Twitter, RenRen and LinkedIn. While SNSs provide diverse benefits, for example fostering interpersonal relationships, community formation and news propagation, they have also attracted uninvited nuisances. Spammers abuse SNSs as vehicles to spread spam rapidly and widely. Spam, that is, unsolicited or inappropriate messages, significantly impairs the credibility and reliability of services. Therefore, detecting spammers has become an urgent and critical issue in SNSs. This paper deals with follow spam in Twitter. Instead of spreading annoying messages to the public, a follow spammer follows (subscribes to) legitimate users. Based on the assumption that the online relationships of spammers are different from those of legitimate users, we propose classification schemes that detect follow spammers. In particular, we focus on cascaded social relations and devise two schemes, TSP-Filtering and SS-Filtering, which utilize the Triad Significance Profile (TSP) and Social Status (SS), respectively, in a two-hop subnetwork centered at each user. We also propose an ensemble technique, Cascaded-Filtering, that combines both TSP and SS properties. Our experiments on real Twitter datasets demonstrate that the three proposed approaches are very practical. The proposed schemes are scalable because, instead of analyzing the whole network, they inspect user-centered two-hop social networks. Our performance study shows that the proposed methods yield significantly better performance than prior schemes in terms of true positives and false positives.
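To make the scalability argument concrete, the following is a minimal sketch of how a user-centered two-hop subnetwork might be extracted from a directed follow graph. It is an illustration under stated assumptions, not the implementation used in this work: the function name `two_hop_subnetwork`, the dictionary-of-sets graph representation, and the choice to treat both followers and followees as neighbors are all hypothetical.

```python
from collections import deque


def two_hop_subnetwork(follows, user, hops=2):
    """Return (nodes, edges) of the subnetwork within `hops` of `user`.

    `follows` maps each account to the set of accounts it follows.
    Assumption: a neighbor is any account connected by a follow edge
    in either direction (follower or followee).
    """
    # Build the reverse adjacency (who follows whom) once.
    followers = {}
    for src, dsts in follows.items():
        for dst in dsts:
            followers.setdefault(dst, set()).add(src)

    # Breadth-first search that stops expanding at distance `hops`.
    dist = {user: 0}
    queue = deque([user])
    while queue:
        node = queue.popleft()
        if dist[node] == hops:
            continue
        neighbors = follows.get(node, set()) | followers.get(node, set())
        for nb in neighbors:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)

    # Keep only the directed follow edges whose endpoints were both reached.
    nodes = set(dist)
    edges = {(s, d) for s in nodes for d in follows.get(s, set()) if d in nodes}
    return nodes, edges


# Toy follow chain: a -> b -> c -> d -> e
example = {"a": {"b"}, "b": {"c"}, "c": {"d"}, "d": {"e"}}
nodes, edges = two_hop_subnetwork(example, "a")
print(sorted(nodes), sorted(edges))
```

Only the accounts reached by this bounded search are inspected for a given user, so features such as a triad profile would be computed on this small subgraph rather than on the whole network.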
