FrauDetector: A Graph-Mining-based Framework for Fraudulent Phone Call Detection

In recent years, fraud has increased rapidly with the development of modern technology and global communication. Although many studies have addressed fraud detection, existing work formulates it only as a binary classification problem. Because telecommunication records provide limited information, such classifier-based approaches to fraudulent phone call detection usually perform poorly. In this paper, we develop a graph-mining-based fraudulent phone call detection framework for a mobile application that automatically annotates fraudulent phone numbers with a "fraud" tag, a crucial prerequisite for distinguishing fraudulent phone calls from normal ones. Our detection approach runs a weighted HITS algorithm to learn a trust value for each remote phone number. From telecommunication records, we build two kinds of directed bipartite graphs, i) CPG and ii) UPG, to represent the telecommunication behavior of users. To weight the edges of CPG and UPG, we extract features for each pair of user and remote phone number from two different yet complementary aspects: 1) duration relatedness (DR) between the user and the phone number; and 2) frequency relatedness (FR) between the user and the phone number. On the weighted CPG and UPG, we determine a trust value for each remote phone number. Finally, we conduct a comprehensive experimental study on a dataset collected through an anti-fraud mobile application, Whoscall. The results demonstrate the effectiveness of our weighted HITS-based approach and show the benefit of taking both DR and FR into account in feature extraction.
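The weighted HITS step can be illustrated with a minimal sketch on a single user-to-phone-number bipartite graph. This is not the authors' implementation. The names `edge_weight` and `weighted_hits`, the linear mix of DR and FR controlled by `alpha`, and the reading of the authority score as a stand-in for a trust value are all assumptions made for illustration, since the abstract does not specify how DR and FR are combined or how the trust value is derived from the HITS scores.

```python
from collections import defaultdict
from math import sqrt


def edge_weight(dr, fr, alpha=0.5):
    """Hypothetical edge weight: a linear mix of duration relatedness (dr)
    and frequency relatedness (fr). The mixing rule is an assumption."""
    return alpha * dr + (1.0 - alpha) * fr


def weighted_hits(edges, iterations=50):
    """Weighted HITS on a bipartite graph.

    edges: list of (user, number, weight) tuples with weight > 0.
    Returns (hub, auth): hub scores for users, authority scores for numbers.
    """
    users = {u for u, _, _ in edges}
    numbers = {n for _, n, _ in edges}
    hub = {u: 1.0 for u in users}
    auth = {n: 1.0 for n in numbers}

    for _ in range(iterations):
        # Authority update: weighted sum of hub scores of calling users.
        new_auth = defaultdict(float)
        for u, n, w in edges:
            new_auth[n] += w * hub[u]
        norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {n: new_auth[n] / norm for n in numbers}

        # Hub update: weighted sum of authority scores of called numbers.
        new_hub = defaultdict(float)
        for u, n, w in edges:
            new_hub[u] += w * auth[n]
        norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {u: new_hub[u] / norm for u in users}

    return hub, auth


# Toy data: (user, remote number, DR, FR), all values invented.
records = [
    ("alice", "n1", 0.9, 0.8),
    ("bob", "n1", 0.7, 0.9),
    ("bob", "n2", 0.1, 0.1),
]
edges = [(u, n, edge_weight(dr, fr)) for u, n, dr, fr in records]
hub, auth = weighted_hits(edges)
```

In this toy graph, `n1` is reached by two users over heavily weighted edges and `n2` by one user over a lightly weighted edge, so `n1` ends up with the higher authority score. Whether a higher score should map to higher trust depends on how the graph and weights are defined, which the abstract leaves open.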
