QuickSquad: A new single-machine graph computing framework for detecting fake accounts in large-scale social networks

Graph-based detection is one of the important means of defending social networks against attacks by fake accounts. As social networks grow in scale, more and more researchers are using graph computing frameworks to speed up their detection algorithms. We analyze the graph data of social networks and state-of-the-art graph computing frameworks in detail, and find that some techniques in current graph computing systems are overgeneralized and suboptimal: they focus on processing general graphs and miss optimizations specific to social network graphs. In this paper we therefore propose QuickSquad, a single-server graph computing system optimized for the structure of social network graphs. QuickSquad follows a "divide and rule" approach instead of overgeneralization. First, we divide the graph structure data into a heavy set and a light set according to the out-degree of the vertices. Then, we 1) store the two sets in different formats, 2) process them with edge-based updating and vertex-based updating, as appropriate, in a two-phase processing model, 3) apply two selective scheduling strategies at different levels, i.e., vertex level and file level, and 4) provide four cache priorities for when memory is not large enough to cache all the data. Finally, we implement two detection methods, dSybilRank and dCOLOR, on our system. Experiments demonstrate that our system improves performance by 1.14X to 5.91X compared with current graph computing systems such as GridGraph.
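
The heavy/light division by out-degree can be illustrated with a minimal Python sketch. It is not QuickSquad's implementation: the function name `split_by_out_degree`, the `threshold` parameter, and the two storage formats chosen here (adjacency lists for heavy vertices, a flat edge list for light ones) are assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict


def split_by_out_degree(edges, threshold):
    """Partition directed edges by the out-degree of their source vertex.

    Hypothetical helper: `threshold` and both storage formats are
    illustrative assumptions, not values or formats taken from the paper.
    """
    # First pass: count the out-degree of every source vertex.
    out_degree = defaultdict(int)
    for src, _dst in edges:
        out_degree[src] += 1

    # Vertices at or above the threshold form the heavy set.
    heavy_vertices = {v for v, d in out_degree.items() if d >= threshold}

    heavy = defaultdict(list)  # assumed format: adjacency list per heavy vertex
    light = []                 # assumed format: flat (src, dst) edge list

    # Second pass: route each edge to the set its source vertex belongs to.
    for src, dst in edges:
        if src in heavy_vertices:
            heavy[src].append(dst)
        else:
            light.append((src, dst))
    return dict(heavy), light


if __name__ == "__main__":
    sample_edges = [(0, 1), (0, 2), (0, 3), (1, 0), (2, 0)]
    heavy_set, light_set = split_by_out_degree(sample_edges, threshold=3)
    print("heavy:", heavy_set)
    print("light:", light_set)
```

Keeping the two sets in separate structures is what would allow a system to later apply a different update style to each, as the abstract describes for its two-phase processing model.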
