User profiling and classification for fraud detection in mobile communications networks

The topic of this thesis is fraud detection in mobile communications networks by means of user profiling and classification techniques. The goal is first to identify relevant user groups based on call data and then to assign each user to a relevant group. Fraud may be defined as the dishonest or illegal use of services with the intention of avoiding service charges. Fraud detection is an important application, since network operators lose a significant portion of their revenue to fraud. Although the intentions of mobile phone users cannot be observed directly, it is assumed that they are reflected in the call data. The call data is therefore used to describe the behavioral patterns of users. Neural networks and probabilistic models are employed to learn these usage patterns from call data. The models are used either to detect abrupt changes in established usage patterns or to recognize usage patterns typical of fraud. Empirical tests on data from real mobile communications networks show that the methods are effective in detecting fraudulent behavior.
