A Boosting-Based Prototype Weighting and Selection Scheme

Prototype Selection (PS), i.e., the search for relevant subsets of instances, is an interesting Data Mining problem. The original studies of Hart and Gates produced, step by step, a Condensed or Reduced set of prototypes, evaluated by the accuracy of a Nearest Neighbor rule. In this paper we present a new approach to PS. It is inspired by a recent classification technique known as Boosting, whose ideas were previously unused in that field. Three interesting properties emerge from our adaptation. First, accuracy, which has been the standard in PS since Hart and Gates, is no longer the reliability criterion. Second, PS interacts with a prototype weighting scheme, i.e., each prototype periodically receives a real-valued confidence, its significance, with respect to the currently selected set. Finally, Boosting as used in PS yields an algorithm whose time complexity compares favorably with that of classical PS algorithms. Experiments lead to the following conclusion: the output of the algorithm on fourteen benchmarks is often more accurate than those of three state-of-the-art PS algorithms.