Cost-based modeling for fraud and intrusion detection: results from the JAM project

We describe the results achieved using the JAM distributed data mining system for the real-world problem of fraud detection in financial information systems. For this domain we provide clear evidence that state-of-the-art commercial fraud detection systems can be substantially improved in stopping losses due to fraud by combining multiple models of fraudulent transactions shared among banks. We demonstrate that the traditional statistical metrics used to train and evaluate the performance of learning systems (i.e., statistical accuracy or ROC analysis) are misleading and perhaps inappropriate for this application. Cost-based metrics are more relevant in certain domains, and defining such metrics poses significant and interesting research questions, both in evaluating systems and alternative models and in formalizing the problems to which one may wish to apply data mining technologies. This paper also demonstrates how the techniques developed for fraud detection can be generalized and applied to the important area of intrusion detection in networked information systems. We report the outcome of recent evaluations of our system applied to tcpdump network intrusion data, specifically with respect to statistical accuracy. This work involved building additional components of JAM that we have come to call MADAM ID (Mining Audit Data for Automated Models for Intrusion Detection). However, taking the next step to define cost-based models for intrusion detection poses interesting new research questions. We describe our initial ideas about how to evaluate intrusion detection systems using cost models learned during our work on fraud detection.
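To illustrate why statistical accuracy can mislead in this setting, the following minimal sketch compares two hypothetical detectors under a simple cost model. Everything in it is an assumption made for illustration and is not taken from JAM: the `Transaction` type, the fixed `OVERHEAD` charged for investigating each flagged transaction, the rule that a missed fraud costs the full transaction amount, and the toy data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Transaction:
    amount: float    # transaction amount in dollars
    is_fraud: bool   # ground-truth label
    flagged: bool    # detector's prediction


# Hypothetical fixed cost of investigating one flagged transaction.
OVERHEAD = 75.0


def accuracy(txns: List[Transaction]) -> float:
    """Fraction of transactions whose prediction matches the label."""
    return sum(t.flagged == t.is_fraud for t in txns) / len(txns)


def total_cost(txns: List[Transaction], overhead: float = OVERHEAD) -> float:
    """Assumed cost model: every alert costs `overhead`;
    every missed fraud costs the full transaction amount."""
    cost = 0.0
    for t in txns:
        if t.flagged:
            cost += overhead
        elif t.is_fraud:
            cost += t.amount
    return cost


def label(flags: List[bool]) -> List[Transaction]:
    """Attach one detector's predictions to the same toy transactions."""
    amounts = [10.0, 20.0, 5000.0, 30.0]
    frauds = [True, True, True, False]
    return [Transaction(a, f, p) for a, f, p in zip(amounts, frauds, flags)]


# Detector A catches the two small frauds but misses the large one.
detector_a = label([True, True, False, False])
# Detector B catches only the large fraud.
detector_b = label([False, False, True, False])

for name, txns in [("A", detector_a), ("B", detector_b)]:
    print(name, accuracy(txns), total_cost(txns))
```

On this toy data, detector A is more accurate (0.75 versus 0.5) yet incurs a far higher cost (5150 versus 105), because the one fraud it misses is the expensive one. A metric that weights errors by their monetary consequence ranks the two detectors in the opposite order from accuracy.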
