BadLink: Combining Graph and Information-Theoretical Features for Online Fraud Group Detection

Fraud severely hurts many kinds of Internet businesses. Group-based fraud detection is a popular methodology for catching fraudsters, who unavoidably exhibit synchronized behaviors. We combine graph-based features (e.g. cluster density) and information-theoretical features (e.g. probability for the similarity) of fraud groups into two intuitive metrics. Based on these metrics, we build an extensible fraud detection framework, BadLink, that supports multimodal datasets with different data types and distributions in a scalable way. Experiments on a real production workload, as well as extensive comparisons with existing solutions, demonstrate the state-of-the-art performance of BadLink, even in the presence of sophisticated camouflage traffic.
