Random Walk Based Fake Account Detection in Online Social Networks

Online social networks are known to be vulnerable to the so-called Sybil attack, in which an attacker maintains a large number of fake accounts (also called Sybils) and uses them to perform various malicious activities. Sybil detection is therefore a fundamental security research problem in online social networks. Random walk based methods, which leverage the structure of an online social network to distribute reputation scores among users, have been shown to be promising in certain real-world online social networks. In particular, random walk based methods have three desirable features: they have theoretically guaranteed performance for online social networks that have the fast-mixing property, they are accurate when the social network has a strong homophily property, and they scale to large online social networks. However, existing random walk based methods suffer from several key limitations: 1) they can leverage either labeled benign users or labeled Sybils, but not both, 2) they have limited detection accuracy for weak-homophily social networks, and 3) they are not robust to label noise in the training dataset. In this work, we propose a new random walk based Sybil detection method called SybilWalk. SybilWalk addresses the limitations of existing random walk based methods while maintaining their desirable features. We perform both theoretical and empirical evaluations to compare SybilWalk with previous random walk based methods. Theoretically, for online social networks with the fast-mixing property, SybilWalk has a tighter asymptotic bound on the number of Sybils that are falsely accepted into the social network than all existing random walk based methods. Empirically, we compare SybilWalk with previous random walk based methods using both social networks with synthesized Sybils and a large-scale Twitter dataset with real Sybils. Our empirical results demonstrate that 1) SybilWalk is substantially more accurate than existing random walk based methods for weak-homophily social networks, 2) SybilWalk is substantially more robust to label noise than existing random walk based methods, and 3) SybilWalk is as scalable as the most efficient existing random walk based methods. In particular, on the Twitter dataset, SybilWalk achieves a false positive rate of 1.3% and a false negative rate of 17.3%.
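To make the idea of distributing reputation scores over the graph structure concrete, the following is a minimal sketch of a generic random walk style score propagation that uses both labeled benign users and labeled Sybils. It is not the SybilWalk algorithm itself, whose details are not given in this abstract. The function name `random_walk_scores`, the toy graph, the initial score of 0.5, and the iteration count are all illustrative assumptions.

```python
def random_walk_scores(adjacency, benign, sybil, iterations=100):
    """Generic illustration (an assumption, not the paper's algorithm).

    Each unlabeled node's score is repeatedly replaced by the average of
    its neighbors' scores. Labeled benign nodes stay fixed at 0.0 and
    labeled Sybils stay fixed at 1.0, so a higher score means "more
    Sybil-like".
    """
    scores = {node: 0.5 for node in adjacency}  # neutral starting point
    for node in benign:
        scores[node] = 0.0
    for node in sybil:
        scores[node] = 1.0
    labeled = set(benign) | set(sybil)

    for _ in range(iterations):
        updated = dict(scores)
        for node, neighbors in adjacency.items():
            if node in labeled or not neighbors:
                continue  # labels are fixed; isolated nodes keep their score
            updated[node] = sum(scores[n] for n in neighbors) / len(neighbors)
        scores = updated
    return scores


# Hypothetical toy graph: a benign region (a, b, c) and a Sybil region
# (x, y, z) connected by a single attack edge c-x.
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "x"],
    "x": ["c", "y", "z"],
    "y": ["x", "z"],
    "z": ["x", "y"],
}

scores = random_walk_scores(toy_graph, benign=["a"], sybil=["z"])
for node in sorted(scores):
    print(node, round(scores[node], 3))
```

In this toy setting the unlabeled nodes in the benign region end up with scores below those in the Sybil region, because only one edge connects the two regions. This reflects the homophily assumption described above; on a weak-homophily graph with many attack edges, such a simple averaging scheme would separate the regions less cleanly.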
