Understanding Black Boxes: Knowledge Induction From Models

Because regulations and laws prohibit the use of private customer and transaction data held in customer databases, most customer data sets are not easily accessible, even within the same organization. One solution to this regulatory problem is to provide statistical summaries of the data, or models induced from the data, instead of the raw data sets. Such models, however, carry only limited information about the original raw data. This study explores possible solutions to that limitation, using prediction models built from customer credit information provided by a local bank in Seoul, South Korea. It suggests approaches for figuring out what is inside non-rule-based models such as regression models or neural network models. The study proposes several algorithms as possible solutions, including a rule accumulation algorithm (RAA) and a GA-based rule refinement algorithm (GA-RRA). The experiments report the performance of the random dataset, RAA, elimination of redundant rules (ERR), and GA-RRA.
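
The general idea of probing a model with a random dataset and collecting rules from its answers can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the study's RAA, ERR, or GA-RRA: the `black_box` scoring function, the two features (`income`, `debt`), the grid-cell rule format, and the merging step are all hypothetical stand-ins chosen to keep the example small.

```python
import random

def black_box(income, debt):
    """Hypothetical opaque credit model (an assumption, not the study's model)."""
    return "approve" if 0.7 * income - 1.2 * debt > 10 else "reject"

def probe(model, n, seed=0):
    """Build a random dataset and label each row with the model's prediction."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        income, debt = rng.uniform(0, 100), rng.uniform(0, 100)
        rows.append((income, debt, model(income, debt)))
    return rows

def accumulate_rules(rows, bins=4, width=25.0):
    """Keep one rule per grid cell whose probed points all share a label."""
    cells = {}
    for income, debt, label in rows:
        key = (min(int(income // width), bins - 1),
               min(int(debt // width), bins - 1))
        cells.setdefault(key, set()).add(label)
    return {key: next(iter(labels))
            for key, labels in cells.items() if len(labels) == 1}

def merge_redundant(rules):
    """Merge rules that are adjacent along the debt axis and share a label.

    Each merged rule is (income_bin, debt_bin_low, debt_bin_high, label).
    """
    merged = []
    for (i, j), label in sorted(rules.items()):
        last = merged[-1] if merged else None
        if last and last[0] == i and last[2] == j - 1 and last[3] == label:
            merged[-1] = (i, last[1], j, label)
        else:
            merged.append((i, j, j, label))
    return merged

rules = accumulate_rules(probe(black_box, 2000))
merged = merge_redundant(rules)
for income_bin, debt_lo, debt_hi, label in merged:
    print(f"income bin {income_bin}, debt bins {debt_lo}-{debt_hi}: {label}")
```

Cells in which the model gives mixed answers produce no rule here, so the rule set covers only the regions where the probed behavior is uniform. A refinement step, such as the GA-based one the study proposes, would be one way to address those boundary regions.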
