HitFraud: A Broad Learning Approach for Collective Fraud Detection in Heterogeneous Information Networks

On electronic game platforms, payment transactions carry different levels of risk. In e-commerce, risk is generally higher for digital goods, but it varies with the product and its popularity, the offer type (packaged game, virtual currency for a game, or subscription service), the storefront, and geography. Existing fraud policies and models decide on each transaction independently, based on transaction attributes, payment velocities, user characteristics, and other relevant information. Suspicious transactions may still evade detection under this approach, so we propose a broad learning approach that takes a graph-based perspective to uncover relationships among suspicious transactions, i.e., inter-transaction dependency. Our focus is on detecting suspicious transactions by capturing common fraudulent behaviors that would not appear suspicious when considered in isolation. In this paper, we present HitFraud, which leverages heterogeneous information networks for collective fraud detection by exploring correlated and fast-evolving fraudulent behaviors. First, a heterogeneous information network is designed to link entities of interest in the transaction database via different semantics. Then, graph-based features are efficiently discovered from the network using the concept of meta-paths, and fraud decisions are made collectively on test instances. Experiments on real-world payment transaction data from Electronic Arts demonstrate that HitFraud effectively boosts prediction performance and converges quickly.
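The two steps described above (meta-path features, then collective decisions) can be illustrated with a minimal sketch. It is not the HitFraud implementation: the entity types `ip` and `card`, the toy transactions, the `prior` value, and the averaging rule that stands in for a trained classifier are all assumptions made for illustration. The sketch links transactions that share an entity, which corresponds to a meta-path of the form transaction → entity → transaction, and then iteratively re-scores unlabeled transactions from the current scores of their meta-path neighbors.

```python
from collections import defaultdict

# Hypothetical toy data: transaction id -> {entity type: entity value}.
transactions = {
    "t1": {"ip": "ip_a", "card": "c1"},
    "t2": {"ip": "ip_a", "card": "c2"},
    "t3": {"ip": "ip_b", "card": "c2"},
    "t4": {"ip": "ip_c", "card": "c3"},
}

def metapath_neighbors(transactions, entity_type):
    """Neighbors along the meta-path transaction -> entity_type -> transaction."""
    by_entity = defaultdict(set)
    for tid, attrs in transactions.items():
        if entity_type in attrs:
            by_entity[attrs[entity_type]].add(tid)
    neighbors = {tid: set() for tid in transactions}
    for tids in by_entity.values():
        for tid in tids:
            neighbors[tid] |= tids - {tid}
    return neighbors

def metapath_feature(neighbors, scores):
    """Mean current fraud score of each transaction's neighbors (0.0 if none)."""
    return {
        tid: (sum(scores[n] for n in nbrs) / len(nbrs) if nbrs else 0.0)
        for tid, nbrs in neighbors.items()
    }

def collective_inference(transactions, known, entity_types,
                         prior=0.1, rounds=20, tol=1e-6):
    """Iteratively re-score unlabeled transactions from their neighbors' scores."""
    neighbor_maps = [metapath_neighbors(transactions, e) for e in entity_types]
    scores = {tid: known.get(tid, prior) for tid in transactions}
    for _ in range(rounds):
        feats = [metapath_feature(nm, scores) for nm in neighbor_maps]
        new_scores = {}
        for tid in transactions:
            if tid in known:
                new_scores[tid] = known[tid]  # labeled instances stay fixed
            else:
                # Assumed stand-in for a trained classifier: blend the prior
                # with the average of the meta-path features.
                relational = sum(f[tid] for f in feats) / len(feats)
                new_scores[tid] = 0.5 * prior + 0.5 * relational
        delta = max(abs(new_scores[t] - scores[t]) for t in transactions)
        scores = new_scores
        if delta < tol:
            break
    return scores

# One transaction is known to be fraudulent; the rest are scored collectively.
scores = collective_inference(transactions, {"t1": 1.0}, ["ip", "card"])
```

In this toy network, `t2` shares an IP address with the known fraud `t1`, and `t3` shares a card with `t2`, so suspicion propagates along both meta-paths while the unconnected `t4` stays near the prior-only score. A realistic system would replace the averaging rule with a classifier trained on transaction attributes together with the meta-path features.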
