Hybrid genetic algorithms and case‐based reasoning systems for customer classification

Because of its convenience and strength in complex problem solving, case-based reasoning (CBR) has been widely used in various areas. One of these areas is customer classification, which assigns customers to either a purchasing or a non-purchasing group. Nonetheless, CBR has been criticized for its low prediction accuracy compared to other machine learning techniques. To obtain successful results from CBR, effective retrieval of useful prior cases for the given problem is essential, yet designing a good matching and retrieval mechanism for CBR systems remains a controversial research issue. Most previous studies have tried to optimize either the weights of the features or the selection of appropriate instances, but these two optimizations have been performed independently. In particular, there have been few attempts to optimize the feature weights and the instance selection for CBR simultaneously, even though doing so may lead to better performance than naive models. Here we suggest a simultaneous optimization model of these components using a genetic algorithm. To validate the usefulness of our approach, we apply it to two real-world cases of customer classification. Experimental results show that simultaneously optimized CBR may improve classification accuracy and outperform various optimized CBR models as well as other classification models, including logistic regression, multiple discriminant analysis, artificial neural networks, and support vector machines.
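The idea of optimizing feature weights and instance selection together can be illustrated with a minimal sketch. The abstract does not specify the chromosome encoding, fitness function, or genetic operators, so everything below is an assumption made for illustration: each chromosome concatenates one real-valued weight per feature with one 0/1 bit per stored case, retrieval is plain 1-nearest-neighbor, fitness is accuracy on a held-out validation set, and the operators are one-point crossover with truncation selection. All function and parameter names (`optimize`, `classify`, `mutation_rate`, and so on) are hypothetical.

```python
import random


def weighted_distance(a, b, weights):
    """Squared Euclidean distance with one weight per feature."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))


def classify(query, cases, labels, weights, mask):
    """Return the label of the nearest stored case among those kept by mask."""
    best_label, best_dist = None, float("inf")
    for case, label, keep in zip(cases, labels, mask):
        if not keep:
            continue  # this case was deselected by the chromosome
        d = weighted_distance(query, case, weights)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


def fitness(chrom, n_features, train_x, train_y, val_x, val_y):
    """Validation accuracy of the CBR model encoded by one chromosome."""
    weights, mask = chrom[:n_features], chrom[n_features:]
    if not any(mask):
        return 0.0  # an empty case base cannot classify anything
    hits = sum(
        classify(x, train_x, train_y, weights, mask) == y
        for x, y in zip(val_x, val_y)
    )
    return hits / len(val_x)


def mutate(chrom, n_features, rate, rng):
    """Resample a weight gene or flip a selection bit with probability rate."""
    out = list(chrom)
    for i in range(len(out)):
        if rng.random() < rate:
            out[i] = rng.random() if i < n_features else 1 - out[i]
    return out


def crossover(a, b, rng):
    """One-point crossover over the combined weight + mask chromosome."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def optimize(train_x, train_y, val_x, val_y,
             pop_size=30, generations=40, mutation_rate=0.05, seed=0):
    """Search feature weights and the instance mask in a single GA run."""
    rng = random.Random(seed)
    n_features, n_cases = len(train_x[0]), len(train_x)

    def score(c):
        return fitness(c, n_features, train_x, train_y, val_x, val_y)

    population = [
        [rng.random() for _ in range(n_features)]
        + [rng.randint(0, 1) for _ in range(n_cases)]
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        elite = ranked[: pop_size // 2]  # truncation selection, elites survive
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng),
                   n_features, mutation_rate, rng)
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    best = max(population, key=score)
    return best[:n_features], best[n_features:], score(best)
```

Calling `optimize(train_x, train_y, val_x, val_y)` with lists of numeric feature vectors and class labels returns the best weights, the best instance mask, and the validation accuracy reached. Because both parts live in one chromosome, a weight vector is always evaluated together with the case subset it will actually retrieve from, which is the point of optimizing the two jointly rather than in separate passes.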
