No Place to Hide: Catching Fraudulent Entities in Tensors

Many approaches focus on detecting dense blocks in the tensor of multimodal data to prevent fraudulent entities (e.g., accounts, links) from retweet boosting, hashtag hijacking, link advertising, etc. However, no existing method is effective to find the dense block if it only possesses high density on a subset of all dimensions in tensors. In this paper, we novelly identify dense-block detection with dense-subgraph mining, by modeling a tensor into a weighted graph without any density information lost. Based on the weighted graph, which we call information sharing graph (ISG), we propose an algorithm for finding multiple densest subgraphs, D-Spot, that is faster (up to 11x faster than the state-of-the-art algorithm) and can be computed in parallel. In an N-dimensional tensor, the entity group found by the ISG+D-Spot is at least 1/2 of the optimum with respect to density, compared with the 1/N guarantee ensured by competing methods. We use nine datasets to demonstrate that ISG+D-Spot becomes new state-of-the-art dense-block detection method in terms of accuracy specifically for fraud detection.

[1]  Charu C. Aggarwal,et al.  A Survey of Algorithms for Dense Subgraph Discovery , 2010, Managing and Mining Graph Data.

[2]  Ming-Yang Su Real-time anomaly detection systems for Denial-of-Service attacks by weighted k-nearest-neighbor classifiers , 2011, Expert Syst. Appl..

[3]  Christos Faloutsos,et al.  Inferring lockstep behavior from connectivity pattern in large graphs , 2016, Knowledge and Information Systems.

[4]  Moses Charikar,et al.  Greedy approximation algorithms for finding dense components in a graph , 2000, APPROX.

[5]  Christos Faloutsos,et al.  CatchSync: catching synchronized behavior in large directed graphs , 2014, KDD.

[6]  Samir Khuller,et al.  Dense Subgraphs with Restrictions and Applications to Gene Annotation Graphs , 2010, RECOMB.

[7]  Christos Faloutsos,et al.  D-Cube: Dense-Block Detection in Terabyte-Scale Tensors , 2017, WSDM.

[8]  Damon McCoy,et al.  Dialing Back Abuse on Phone Verified Accounts , 2014, CCS.

[9]  Qiang Cao,et al.  Uncovering Large Groups of Active Malicious Accounts in Online Social Networks , 2014, CCS.

[10]  Christos Faloutsos,et al.  EigenSpokes: Surprising Patterns and Scalable Community Chipping in Large Graphs , 2010, PAKDD.

[11]  Hyun Ah Song,et al.  FRAUDAR: Bounding Graph Fraud in the Face of Camouflage , 2016, KDD.

[12]  Christos Faloutsos,et al.  Spotting Suspicious Behaviors in Multimodal Data: A General Metric and Algorithms , 2016, IEEE Transactions on Knowledge and Data Engineering.

[13]  Hua Tang,et al.  Machine Learning-based Intrusion Detection Algorithms , 2009 .

[14]  Christos Faloutsos,et al.  Spotting Suspicious Link Behavior with fBox: An Adversarial Perspective , 2014, 2014 IEEE International Conference on Data Mining.

[15]  Arjun Mukherjee,et al.  What Yelp Fake Review Filter Might Be Doing? , 2013, ICWSM.

[16]  Nikos D. Sidiropoulos,et al.  ParCube: Sparse Parallelizable Tensor Decompositions , 2012, ECML/PKDD.

[17]  Kijung Shin,et al.  Distributed Methods for High-Dimensional and Large-Scale Tensor Factorization , 2014, 2014 IEEE International Conference on Data Mining.

[18]  Hector Garcia-Molina,et al.  Combating Web Spam with TrustRank , 2004, VLDB.

[19]  Stefan Savage,et al.  Spamscatter: Characterizing Internet Scam Hosting Infrastructure , 2007, USENIX Security Symposium.

[20]  Samir Khuller,et al.  On Finding Dense Subgraphs , 2009, ICALP.

[21]  Andrew V. Goldberg,et al.  Finding a Maximum Density Subgraph , 1984 .

[22]  C. E. SHANNON,et al.  A mathematical theory of communication , 1948, MOCO.

[23]  Christos Faloutsos,et al.  Opinion Fraud Detection in Online Reviews by Network Effects , 2013, ICWSM.

[24]  Liu Xin,et al.  No Place to Hide: Catching Fraudulent Entities in Tensors , 2019, WWW 2019.

[25]  R.K. Cunningham,et al.  Evaluating intrusion detection systems: the 1998 DARPA off-line intrusion detection evaluation , 2000, Proceedings DARPA Information Survivability Conference and Exposition. DISCEX'00.

[26]  Christos Faloutsos,et al.  Netprobe: a fast and scalable system for fraud detection in online auction networks , 2007, WWW '07.

[27]  Yanfang Ye,et al.  HinDroid: An Intelligent Android Malware Detection System Based on Structured Heterogeneous Information Network , 2017, KDD.

[28]  Yousef Saad,et al.  Dense Subgraph Extraction with Application to Community Detection , 2012, IEEE Transactions on Knowledge and Data Engineering.

[29]  Christos Faloutsos,et al.  MultiAspectForensics: Pattern Mining on Large-Scale Heterogeneous Networks with Tensor Analysis , 2011, 2011 International Conference on Advances in Social Networks Analysis and Mining.

[30]  Tamara G. Kolda,et al.  Tensor Decompositions and Applications , 2009, SIAM Rev..

[31]  Christos Faloutsos,et al.  Fast, Accurate, and Flexible Algorithms for Dense Subtensor Mining , 2018, ACM Trans. Knowl. Discov. Data.

[32]  Alexander J. Smola,et al.  Fast and Guaranteed Tensor Decomposition via Sketching , 2015, NIPS.

[33]  Leman Akoglu,et al.  Collective Opinion Spam Detection: Bridging Review Networks and Metadata , 2015, KDD.

[34]  Michael J. Franklin,et al.  Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing , 2012, NSDI.

[35]  Christos Faloutsos,et al.  EigenSpokes: Surprising Patterns and Scalable Community Chipping in Large Graphs , 2009, 2009 IEEE International Conference on Data Mining Workshops.

[36]  Venkatesan Guruswami,et al.  CopyCatch: stopping group attacks by spotting lockstep behavior in social networks , 2013, WWW.

[37]  Christos Faloutsos,et al.  M-Zoom: Fast Dense-Block Detection in Tensors with Quality Guarantees , 2016, ECML/PKDD.

[38]  Christos Faloutsos,et al.  CoreScope: Graph Mining Using k-Core Analysis — Patterns, Anomalies and Algorithms , 2016, 2016 IEEE 16th International Conference on Data Mining (ICDM).