Comparison of genetic algorithm based prototype selection schemes

1. Introduction

Prototype selection is the process of finding representative patterns from the data. Representative patterns help in reducing the data on which further operations, such as data mining, can be carried out. The current work discusses the computation of prototypes using medoids [1], leaders [2] and distance-based thresholds. After finding the initial set of prototypes, the optimal set is found by means of genetic algorithms (GAs). A comparison of stochastic search algorithms is carried out by Susheela Devi and Narasimha Murty [3]. They conclude that the performance of genetic algorithms is the best among the search algorithms. Chang and Lipmann [4] suggest the use of genetic algorithms for pattern classification.

In the following sections, we discuss and compare the various prototype selection methods under consideration. Comparison of results is based on the nearest neighbor classifier (NNC). Subsequently, considering those prototype sets which provided good classification accuracy, GAs are used for optimal prototype selection. Based on the nature of the data characteristics, a number of experiments based on GAs are carried out. A summary of results is presented.

2. Description of data

Handwritten digit data [5] is used for the comparison exercises. The training data consists of 667 patterns for each class of digits 0-9, totalling 6670 patterns. The test data consists of 3333 patterns. While carrying out