SliceNDice: Mining Suspicious Multi-Attribute Entity Groups with Multi-View Graphs

Given the reach of web platforms, bad actors have considerable incentives to manipulate and defraud users at the expense of platform integrity. This has spurred research in numerous suspicious behavior detection tasks, including detection of sybil accounts, false information, and payment scams/fraud. In this paper, we draw the insight that many such initiatives can be tackled in a common framework by posing a detection task which seeks to find groups of entities that share too many properties with one another across multiple attributes (sybil accounts created at the same time and location, propaganda spreaders broadcasting articles with the same rhetoric and with similar reshares, etc.). Our work makes four core contributions. Firstly, we posit a novel formulation of this task as a multi-view graph mining problem, in which distinct views reflect distinct attribute similarities across entities, and contextual similarity and attribute importance are respected. Secondly, we propose a novel suspiciousness metric for scoring entity groups given the abnormality of their synchronicity across multiple views, which obeys intuitive desiderata that existing metrics do not. Thirdly, we propose the SliceNDice algorithm, which enables efficient extraction of highly suspicious entity groups. Finally, we demonstrate its practicality in production, in terms of strong detection performance and discoveries on Snapchat's large advertiser ecosystem (89% precision and numerous discoveries of real fraud rings), marked outperformance of baselines (over 97% precision/recall in simulated settings), and linear scalability.
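To make the multi-view formulation concrete, the following is a minimal illustrative sketch, not the SliceNDice algorithm or its suspiciousness metric. It assumes each entity is a dictionary of categorical attributes and treats each attribute as one view, in which two entities are connected when they share the same value. The function names (`build_views`, `shared_edges`), the attribute names, and the sample accounts are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_views(entities):
    """Build one edge set per attribute (view).

    Two entities are linked in a view when they share the same value for
    that attribute. This is a simplifying assumption for illustration;
    it ignores attribute importance and contextual similarity.
    """
    by_value = defaultdict(list)  # (attribute, value) -> entity ids sharing it
    for entity_id, attrs in entities.items():
        for attribute, value in attrs.items():
            by_value[(attribute, value)].append(entity_id)

    views = defaultdict(set)
    for (attribute, _), ids in by_value.items():
        for a, b in combinations(sorted(ids), 2):
            views[attribute].add(frozenset((a, b)))
    return dict(views)


def shared_edges(views, group):
    """Count, per view, the edges whose endpoints both lie inside `group`."""
    group = set(group)
    return {attr: sum(1 for edge in edges if edge <= group)
            for attr, edges in views.items()}


# Hypothetical sample data: three accounts created in the same hour from
# the same subnet, plus one unrelated account.
entities = {
    "acct1": {"signup_hour": "2019-01-01T03", "ip_subnet": "10.0.0"},
    "acct2": {"signup_hour": "2019-01-01T03", "ip_subnet": "10.0.0"},
    "acct3": {"signup_hour": "2019-01-01T03", "ip_subnet": "10.0.0"},
    "acct4": {"signup_hour": "2019-02-11T17", "ip_subnet": "172.16.4"},
}

views = build_views(entities)
counts = shared_edges(views, ["acct1", "acct2", "acct3"])
print(counts)
```

A group that is fully connected in several views at once, as the first three accounts are here, is the kind of cross-attribute synchronicity the abstract describes. Deciding how abnormal such a group is requires a scoring function, which this sketch deliberately leaves out.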
