JAM: Java Agents for Meta-Learning over Distributed Databases

In this paper, we describe the JAM system, a distributed, scalable, and portable agent-based data mining system that employs meta-learning, a general approach to scaling data mining applications. JAM provides a set of learning programs, implemented as either Java applets or applications, that compute models over data stored locally at a site. JAM also provides a set of meta-learning agents for combining multiple models that may have been learned at different sites. A special distribution mechanism allows the derived models, or classifier agents, to migrate to other remote sites. We describe the overall architecture of the JAM system and the specific implementation currently under development at Columbia University. One of JAM's target applications is fraud and intrusion detection in financial information systems; we briefly describe this learning task and JAM's applicability to it. Interested users may download JAM from http://www.cs.columbia.edu/~sal/JAM/PROJECT.
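
The idea of learning models locally and then combining them can be illustrated with a minimal sketch. It is not JAM's actual code or API: the class and method names (MetaLearningSketch, learnThreshold, combineByVote) are hypothetical, the base learner is a toy one-feature threshold rule, and a plain majority vote stands in for the meta-learning strategies. Passing classifier objects in memory stands in for migrating classifier agents between sites.

```java
import java.util.List;
import java.util.function.DoublePredicate;

/** Hypothetical sketch of combining classifiers learned at separate sites; not JAM's API. */
class MetaLearningSketch {

    /** A labeled example: one numeric feature (e.g. a transaction amount) and a fraud flag. */
    static final class Example {
        final double amount;
        final boolean fraud;

        Example(double amount, boolean fraud) {
            this.amount = amount;
            this.fraud = fraud;
        }
    }

    /**
     * Toy base learner, run independently at each site: chooses the threshold
     * (taken from the local examples) that misclassifies the fewest local examples.
     * With no local data the returned classifier never flags anything.
     */
    static DoublePredicate learnThreshold(List<Example> localData) {
        double bestThreshold = Double.POSITIVE_INFINITY;
        int bestErrors = Integer.MAX_VALUE;
        for (Example candidate : localData) {
            int errors = 0;
            for (Example e : localData) {
                boolean predicted = e.amount >= candidate.amount;
                if (predicted != e.fraud) {
                    errors++;
                }
            }
            if (errors < bestErrors) {
                bestErrors = errors;
                bestThreshold = candidate.amount;
            }
        }
        final double threshold = bestThreshold;
        return amount -> amount >= threshold;
    }

    /**
     * Simple combiner: flags an input when a strict majority of the
     * base classifiers gathered from the sites flag it.
     */
    static DoublePredicate combineByVote(List<DoublePredicate> baseClassifiers) {
        return amount -> {
            int votes = 0;
            for (DoublePredicate classifier : baseClassifiers) {
                if (classifier.test(amount)) {
                    votes++;
                }
            }
            return 2 * votes > baseClassifiers.size();
        };
    }
}
```

In this sketch each site calls learnThreshold on its own data, only the resulting classifiers are collected, and combineByVote produces the final classifier, so the raw data never leaves the site where it is stored.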
