Evolutionary selection of hyperrectangles in nested generalized exemplar learning

Nested generalized exemplar (NGE) theory performs learning by storing objects in Euclidean n-space as hyperrectangles. New data are classified by computing their distance to the nearest "generalized exemplar", or hyperrectangle. This learning method combines distance-based classification with the axis-parallel rectangle representation used in most rule-learning systems. In this paper, we propose the use of evolutionary algorithms to select the most influential hyperrectangles in order to obtain accurate and simple models in classification tasks. The proposal has been compared with the most representative models based on hyperrectangle learning, such as BNGE, RISE, INNER, and the SIA genetics-based learning approach. Our approach is also very competitive with classical rule induction algorithms such as C4.5Rules and RIPPER. The results have been contrasted through non-parametric statistical tests over multiple data sets; they indicate that our approach outperforms the compared methods in accuracy while requiring fewer hyperrectangles to be stored, thus yielding simpler models than previous NGE approaches. Larger data sets have also been tackled with promising outcomes.
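
As an illustration of the nearest-hyperrectangle rule described above, the following minimal Python sketch computes the Euclidean distance from a point to an axis-parallel hyperrectangle (zero when the point lies inside) and assigns the label of the closest one. The names `distance_to_rectangle`, `classify`, and the boolean `selected` mask are assumptions made for this example; the mask only hints at how a selected subset of hyperrectangles might be represented, and the sketch does not reproduce the attribute weighting, tie-breaking, or evolutionary search of the actual proposal.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical representation: (lower bounds, upper bounds, class label).
Hyperrectangle = Tuple[Sequence[float], Sequence[float], str]


def distance_to_rectangle(x: Sequence[float],
                          lower: Sequence[float],
                          upper: Sequence[float]) -> float:
    """Euclidean distance from point x to an axis-parallel hyperrectangle."""
    total = 0.0
    for xi, lo, hi in zip(x, lower, upper):
        if xi < lo:
            gap = lo - xi      # point lies below the interval on this axis
        elif xi > hi:
            gap = xi - hi      # point lies above the interval on this axis
        else:
            gap = 0.0          # point is within the interval on this axis
        total += gap * gap
    return total ** 0.5


def classify(x: Sequence[float],
             rectangles: Sequence[Hyperrectangle],
             selected: Optional[Sequence[bool]] = None) -> str:
    """Return the label of the nearest hyperrectangle.

    `selected` is an optional boolean mask (an assumption of this sketch)
    marking which hyperrectangles are kept in the model.
    """
    best_label, best_dist = None, float("inf")
    for i, (lower, upper, label) in enumerate(rectangles):
        if selected is not None and not selected[i]:
            continue
        d = distance_to_rectangle(x, lower, upper)
        if d < best_dist:      # first-found wins on ties (simplification)
            best_label, best_dist = label, d
    if best_label is None:
        raise ValueError("no hyperrectangle available for classification")
    return best_label


if __name__ == "__main__":
    rects = [((0.0, 0.0), (1.0, 1.0), "a"), ((3.0, 3.0), (4.0, 4.0), "b")]
    print(classify((0.5, 0.5), rects))
    print(classify((2.9, 2.9), rects))
```

Under this representation, a single training instance is simply a degenerate hyperrectangle whose lower and upper bounds coincide, so the same distance function reduces to the ordinary point-to-point Euclidean distance.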
