The cost-minimizing inverse classification problem: a genetic algorithm approach

Abstract. We consider the inverse problem in classification systems, described as follows. Given a set of prototype cases representing a set of categories, a similarity function, and a new case classified in some category, we find the cost-minimizing changes to the case's attribute values such that the case is reclassified as a member of a different, preferred category. The problem is “inverse” because the usual mapping is from a case to its unknown category. The growing use of classification systems in business suggests that solving this inverse problem can be of significant benefit to decision makers as a form of sensitivity analysis. Analytic approaches to this inverse problem are difficult to formulate because the constraints are either unavailable or difficult to determine. To investigate the problem, we develop three search algorithms and study their performance as problem difficulty increases: a real genetic algorithm with feasibility control, a traditional binary genetic algorithm, and a steepest ascent hill climbing algorithm. In a series of simulation experiments, we compare the performance of these algorithms against the optimal solution as the problem difficulty increases (more attributes and classes). In addition, we analyze certain algorithm effects (level of feasibility control, operator design, and fitness function) to determine the best approach. Our results indicate the viability of the real genetic algorithm and the importance of feasibility control as the problem difficulty increases.
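
To make the problem statement concrete, the following is a minimal sketch of one possible setup, not the algorithms developed in the paper. It assumes a nearest-prototype classifier with Euclidean distance as the similarity function, a weighted absolute-change cost, and a simple real-coded genetic algorithm in which feasibility is enforced by a bisection repair step. All names (`classify`, `change_cost`, `repair`, `inverse_classify`) and the toy data are hypothetical.

```python
import math
import random

def classify(x, prototypes):
    """Assign x to the category of its nearest prototype (assumed similarity: Euclidean)."""
    return min(prototypes, key=lambda c: math.dist(x, prototypes[c]))

def change_cost(x, x0, weights):
    """Assumed cost model: weighted absolute change in each attribute."""
    return sum(w * abs(a - b) for w, a, b in zip(weights, x, x0))

def repair(x, target, prototypes, iters=25):
    """Assumed feasibility control: slide an infeasible point toward the target
    prototype and keep the feasible point nearest to x found by bisection."""
    if classify(x, prototypes) == target:
        return list(x)
    p = prototypes[target]
    best = list(p)          # the prototype itself is always in its own category
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        cand = [a + mid * (b - a) for a, b in zip(x, p)]
        if classify(cand, prototypes) == target:
            best, hi = cand, mid
        else:
            lo = mid
    return best

def inverse_classify(x0, target, prototypes, weights,
                     pop_size=40, generations=60, sigma=0.3, seed=0):
    """Search for a low-cost change to x0 that is classified as `target`."""
    rng = random.Random(seed)

    def fitness(x):
        return change_cost(x, x0, weights)

    # Start from the target prototype plus repaired perturbations of x0.
    pop = [list(prototypes[target])]
    while len(pop) < pop_size:
        noisy = [a + rng.gauss(0, sigma) for a in x0]
        pop.append(repair(noisy, target, prototypes))

    for _ in range(generations):
        pop.sort(key=fitness)
        nxt = pop[:2]  # elitism: carry the two cheapest feasible cases forward
        while len(nxt) < pop_size:
            # Tournament selection of two parents.
            p1 = min(rng.sample(pop, 3), key=fitness)
            p2 = min(rng.sample(pop, 3), key=fitness)
            # Arithmetic crossover followed by Gaussian mutation.
            a = rng.random()
            child = [a * u + (1 - a) * v for u, v in zip(p1, p2)]
            child = [g + rng.gauss(0, sigma) for g in child]
            nxt.append(repair(child, target, prototypes))
        pop = nxt

    best = min(pop, key=fitness)
    return best, fitness(best)

# Toy example: three categories in a two-attribute space.
prototypes = {"reject": [0.0, 0.0], "approve": [4.0, 0.0], "review": [0.0, 4.0]}
x0 = [1.0, 0.5]          # currently nearest to the "reject" prototype
weights = [1.0, 1.0]     # equal cost per unit change in each attribute
best, best_cost = inverse_classify(x0, "approve", prototypes, weights)
print(classify(x0, prototypes), "->", classify(best, prototypes), best, best_cost)
```

In this toy setup every candidate is feasible after repair, so the fitness function can be the change cost alone. A penalty-based fitness would be an alternative way to handle infeasible candidates.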
