Bi-level Clustering in Telecommunication Fraud

In this paper we describe a clustering algorithm for fraud detection in the telecom industry. This is ongoing work, developed in collaboration with a leading telecom operator. The choice of clustering algorithms is justified by the need to identify clients' abnormal behaviors through the analysis of huge amounts of data. We propose a novel bi-level clustering methodology: the first level clusters transactional data, and the second level combines the output of the first level with other information to build high-level clusters.
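The two-level structure can be illustrated with a minimal sketch. It is not the authors' implementation: k-means is used purely as a stand-in for the clustering algorithm at both levels, and the function names (`kmeans`, `bilevel_cluster`), the transaction features (call duration and cost), and the client-level attribute (a share of unpaid bills) are all hypothetical choices made for illustration. The second level here summarises each client by the share of its transactions that fall in each first-level cluster, which is one plausible way to "gather data from the first level".

```python
from collections import defaultdict


def kmeans(points, k, iters=20):
    """Plain k-means (Lloyd's algorithm); the first k points seed the centroids.

    Assumes len(points) >= k. Stand-in only: the abstract does not name
    the clustering algorithm used at either level.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def bilevel_cluster(transactions, client_info, k1, k2):
    """Hypothetical bi-level pipeline.

    transactions: list of (client_id, feature_vector)
    client_info:  dict mapping client_id -> list of extra client-level features
    """
    # Level 1: cluster the raw transactions.
    t_labels = kmeans([features for _, features in transactions], k1)

    # Summarise each client by how its transactions spread over level-1 clusters.
    counts = defaultdict(lambda: [0] * k1)
    for (client, _), label in zip(transactions, t_labels):
        counts[client][label] += 1

    clients = sorted(counts)
    profiles = []
    for client in clients:
        total = sum(counts[client])
        shares = [n / total for n in counts[client]]
        # Level 2 input: level-1 summary plus other client-level information.
        profiles.append(shares + list(client_info[client]))

    # Level 2: cluster the client profiles into high-level clusters.
    c_labels = kmeans(profiles, k2)
    return dict(zip(clients, c_labels))


# Made-up data: features are [call duration in minutes, call cost].
transactions = [
    ("A", [1.0, 0.1]), ("C", [60.0, 9.0]), ("A", [2.0, 0.2]),
    ("B", [1.5, 0.1]), ("C", [55.0, 8.0]), ("B", [2.5, 0.3]),
]
# Made-up client-level attribute: share of unpaid bills, in [0, 1].
client_info = {"A": [0.0], "B": [0.1], "C": [0.9]}

print(bilevel_cluster(transactions, client_info, k1=2, k2=2))
```

On this toy data, clients A and B end up in the same high-level cluster and client C in a different one. In practice the features at both levels would need scaling to comparable ranges, since k-means is sensitive to feature magnitude.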
