Genetics-Based Machine Learning for Rule Induction : Taxonomy , Experimental Study and State of the Art

The classification problem can be addressed by numerous techniques and algorithms, which belong to different paradigms of Machine Learning. In this work, we are interested in evolutionary algorithms, the so-called Genetics-Based Machine Learning algorithms. In particular, we will focus on evolutionary approaches that evolve a set of rules, i.e., evolutionary rule-based systems, applied to classification tasks, in order to provide a stateof-the-art in this field. This study has been done with a double aim: to present a taxonomy of the Genetics-Based Machine Learning approaches for rule induction, and to develop an empirical analysis both for standard classification and for classification with imbalanced data sets. We also include a comparative study of the GBML methods with some classical non-evolutionary algorithms, in order to observe the suitability and high power of the search performed by evolutionary algorithms and the behaviour for the GBML algorithms in contrast to the classical approaches, in terms of classification accuracy.

[1]  María José del Jesús,et al.  KEEL: a software tool to assess evolutionary algorithms for data mining problems , 2008, Soft Comput..

[2]  Jesús S. Aguilar-Ruiz,et al.  Natural Encoding for Evolutionary Supervised Learning , 2007, IEEE Transactions on Evolutionary Computation.

[3]  Andrew K. C. Wong,et al.  Class-Dependent Discretization for Inductive Learning from Continuous and Mixed-Mode Data , 1995, IEEE Trans. Pattern Anal. Mach. Intell..

[4]  Stephen F. Smith,et al.  Flexible Learning of Problem Solving Heuristics Through Adaptive Search , 1983, IJCAI.

[5]  Adaptation , 1926 .

[6]  S. García,et al.  An Extension on "Statistical Comparisons of Classifiers over Multiple Data Sets" for all Pairwise Comparisons , 2008 .

[7]  Taeho Jo,et al.  A Multiple Resampling Method for Learning from Imbalanced Data Sets , 2004, Comput. Intell..

[8]  Alexandre Parodi,et al.  An Efficient Classifier System and Its Experimental Comparison with Two Representative Learning Methods on Three Medical Domains , 1991, ICGA.

[9]  Wei-Yin Loh,et al.  A Comparison of Prediction Accuracy, Complexity, and Training Time of Thirty-Three Old and New Classification Algorithms , 2000, Machine Learning.

[10]  Ronald L. Rivest,et al.  Learning decision lists , 2004, Machine Learning.

[11]  Kenneth A. De Jong,et al.  Learning Concept Classification Rules Using Genetic Algorithms , 1991, IJCAI.

[12]  David E. Goldberg,et al.  Genetic Algorithms in Search Optimization and Machine Learning , 1988 .

[13]  Maliha S. Nash,et al.  Handbook of Parametric and Nonparametric Statistical Procedures , 2001, Technometrics.

[14]  Guangzhe Fan,et al.  Classification tree analysis using TARGET , 2008, Comput. Stat. Data Anal..

[15]  Cezary Z. Janikow,et al.  A knowledge-intensive genetic algorithm for supervised learning , 1993, Machine Learning.

[16]  Foster J. Provost,et al.  A Survey of Methods for Scaling Up Inductive Algorithms , 1999, Data Mining and Knowledge Discovery.

[17]  K. De Jong,et al.  Using Genetic Algorithms for Concept Learning , 2004, Machine Learning.

[18]  José Hernández-Orallo,et al.  An experimental comparison of performance measures for classification , 2009, Pattern Recognit. Lett..

[19]  Mehmed Kantardzic,et al.  Learning from Data , 2011 .

[20]  Jonathan L. Shapiro,et al.  Genetic Algorithms in Machine Learning , 2001, Machine Learning and Its Applications.

[21]  José Martínez Sotoca,et al.  An analysis of how training data complexity affects the nearest neighbor classifiers , 2007, Pattern Analysis and Applications.

[22]  John H. Holland,et al.  Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence , 1992 .

[23]  Francisco Herrera,et al.  Genetic fuzzy systems: taxonomy, current research trends and prospects , 2008, Evol. Intell..

[24]  Xin Yao,et al.  A novel evolutionary data mining algorithm with applications to churn prediction , 2003, IEEE Trans. Evol. Comput..

[25]  Simon Kasif,et al.  Induction of Oblique Decision Trees , 1993, IJCAI.

[26]  John H. Holland,et al.  Escaping brittleness: the possibilities of general-purpose learning algorithms applied to parallel rule-based systems , 1995 .

[27]  Weimin Xiao,et al.  Evolving accurate and compact classification rules with gene expression programming , 2003, IEEE Trans. Evol. Comput..

[28]  Ester Bernadó-Mansilla,et al.  Accuracy-Based Learning Classifier Systems: Models, Analysis and Applications to Classification Tasks , 2003, Evolutionary Computation.

[29]  Xavier Llorà,et al.  XCS and GALE: A Comparative Study of Two Learning Classifier Systems on Data Mining , 2001, IWLCS.

[30]  T. Ho,et al.  Data Complexity in Pattern Recognition , 2006 .

[31]  Robert P. W. Duin,et al.  Efficient Multiclass ROC Approximation by Decomposition via Confusion Matrix Perturbation Analysis , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[32]  F. Wilcoxon Individual Comparisons by Ranking Methods , 1945 .

[33]  Cosimo Anglano,et al.  NOW G-Net: learning classification programs on networks of workstations , 2002, IEEE Trans. Evol. Comput..

[34]  Deborah R. Carvalho,et al.  A hybrid decision tree/genetic algorithm method for data mining , 2004, Inf. Sci..

[35]  Xindong Wu,et al.  10 Challenging Problems in Data Mining Research , 2006, Int. J. Inf. Technol. Decis. Mak..

[36]  John H. Holland,et al.  Cognitive systems based on adaptive algorithms , 1977, SGAR.

[37]  William W. Cohen Fast Effective Rule Induction , 1995, ICML.

[38]  Gustavo E. A. P. A. Batista,et al.  A study of the behavior of several methods for balancing machine learning training data , 2004, SKDD.

[39]  Arie Ben-David,et al.  A lot of randomness is hiding in accuracy , 2007, Eng. Appl. Artif. Intell..

[40]  Tin Kam Ho,et al.  On classifier domains of competence , 2004, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004..

[41]  José Salvador Sánchez,et al.  On the k-NN performance in a challenging scenario of imbalance and overlapping , 2008, Pattern Analysis and Applications.

[42]  Kay Chen Tan,et al.  A coevolutionary algorithm for rules discovery in data mining , 2006, Int. J. Syst. Sci..

[43]  Martin V. Butz,et al.  Toward a theory of generalization and learning in XCS , 2004, IEEE Transactions on Evolutionary Computation.

[44]  Martin V. Butz,et al.  Data Mining in Learning Classifier Systems: Comparing XCS with GAssist , 2005, IWLCS.

[45]  Nitesh V. Chawla,et al.  Editorial: special issue on learning from imbalanced data sets , 2004, SKDD.

[46]  Jaume Bacardit,et al.  Evolving Multiple Discretizations with Adaptive Intervals for a Pittsburgh Rule-Based Learning Classifier System , 2003, GECCO.

[47]  Kwong-Sak Leung,et al.  Data Mining Using Grammar Based Genetic Programming and Applications , 2000 .

[48]  Francisco Herrera,et al.  A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability , 2009, Soft Comput..

[49]  Pat Langley,et al.  Estimating Continuous Distributions in Bayesian Classifiers , 1995, UAI.

[50]  Filippo Neri,et al.  Search-Intensive Concept Induction , 1995, Evolutionary Computation.

[51]  D.E. Goldberg,et al.  Classifier Systems and Genetic Algorithms , 1989, Artif. Intell..

[52]  Peter Clark,et al.  The CN2 Induction Algorithm , 1989, Machine Learning.

[53]  Alberto Maria Segre,et al.  Programs for Machine Learning , 1994 .

[54]  A. E. Eiben,et al.  Introduction to Evolutionary Computing , 2003, Natural Computing Series.

[55]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[56]  Stewart W. Wilson Generalization in the XCS Classifier System , 1998 .

[57]  Stewart W. Wilson Classifier Fitness Based on Accuracy , 1995, Evolutionary Computation.

[58]  Charles X. Ling,et al.  Using AUC and accuracy in evaluating learning algorithms , 2005, IEEE Transactions on Knowledge and Data Engineering.

[59]  D. Sheskin Handbook of parametric and nonparametric statistical procedures, 2nd ed. , 2000 .

[60]  Nada Lavrac,et al.  The Multi-Purpose Incremental Learning System AQ15 and Its Testing Application to Three Medical Domains , 1986, AAAI.

[61]  Nitesh V. Chawla,et al.  SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..

[62]  R. Barandelaa,et al.  Strategies for learning in class imbalance problems , 2003, Pattern Recognit..

[63]  Janez Demsar,et al.  Statistical Comparisons of Classifiers over Multiple Data Sets , 2006, J. Mach. Learn. Res..

[64]  Federico Divina,et al.  Experimental Evaluation of Discretization Schemes for Rule Induction , 2004, GECCO.

[65]  Steven Guan,et al.  An incremental approach to genetic-algorithms-based classification , 2005, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[66]  Jacob Cohen A Coefficient of Agreement for Nominal Scales , 1960 .

[67]  Simon Kasif,et al.  A System for Induction of Oblique Decision Trees , 1994, J. Artif. Intell. Res..

[68]  Foster J. Provost,et al.  Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction , 2003, J. Artif. Intell. Res..

[69]  Chandrika Kamath,et al.  Inducing oblique decision trees with evolutionary algorithms , 2003, IEEE Trans. Evol. Comput..

[70]  Tin Kam Ho,et al.  Domain of competence of XCS classifier system in complexity measurement space , 2005, IEEE Transactions on Evolutionary Computation.

[71]  Tom Fawcett,et al.  Adaptive Fraud Detection , 1997, Data Mining and Knowledge Discovery.

[72]  Jesús Alcalá-Fdez,et al.  Implementation and Integration of Algorithms into the KEEL Data-Mining Software Tool , 2009, IDEAL.

[73]  Robert A. Lordo,et al.  Learning from Data: Concepts, Theory, and Methods , 2001, Technometrics.

[74]  Stephen F. Smith,et al.  Competition-based induction of decision models from examples , 1993, Machine Learning.

[75]  Sandip Sen,et al.  Using real-valued genetic algorithms to evolve rule sets for classification , 1994, Proceedings of the First IEEE Conference on Evolutionary Computation. IEEE World Congress on Computational Intelligence.

[76]  King-Sun Fu,et al.  A Nonparametric Partitioning Procedure for Pattern Classification , 1969, IEEE Transactions on Computers.

[77]  Miguel Toro,et al.  Evolutionary learning of hierarchical decision rules , 2003, IEEE Trans. Syst. Man Cybern. Part B.

[78]  B. Ripley,et al.  Pattern Recognition , 1968, Nature.

[79]  Jing Liu,et al.  An organizational coevolutionary algorithm for classification , 2006, IEEE Trans. Evol. Comput..

[80]  J. R. Quinlan,et al.  MDL and Categorical Theories (Continued) , 1995, ICML.

[81]  W. Youden,et al.  Index for rating diagnostic tests , 1950, Cancer.

[82]  Jaume Bacardit,et al.  Bloat Control and Generalization Pressure Using the Minimum Description Length Principle for a Pittsburgh Approach Learning Classifier System , 2005, IWLCS.

[83]  D. Kibler,et al.  Instance-based learning algorithms , 2004, Machine Learning.

[84]  Ester Bernadó-Mansilla,et al.  Evolutionary rule-based systems for imbalanced data sets , 2008, Soft Comput..

[85]  J. Rissanen,et al.  Modeling By Shortest Data Description* , 1978, Autom..

[86]  Steven Guan,et al.  Ordered incremental training with genetic algorithms , 2004, Int. J. Intell. Syst..

[87]  Hewijin Christine Jiau,et al.  Evaluation of neural networks and data mining methods on a credit assessment task for class imbalance problem , 2006 .

[88]  J. Shaffer Modified Sequentially Rejective Multiple Test Procedures , 1986 .

[89]  Gilles Venturini,et al.  SIA: A Supervised Inductive Algorithm with Genetic Search for Learning Attributes based Concepts , 1993, ECML.

[90]  Stewart W. Wilson Get Real! XCS with Continuous-Valued Inputs , 1999, Learning Classifier Systems.

[91]  Dr. Alex A. Freitas Data Mining and Knowledge Discovery with Evolutionary Algorithms , 2002, Natural Computing Series.

[92]  Stan Szpakowicz,et al.  Beyond Accuracy, F-Score and ROC: A Family of Discriminant Measures for Performance Evaluation , 2006, Australian Conference on Artificial Intelligence.

[93]  S. Smith,et al.  A Learning System Based on Genetic Algorithms , 1980 .

[94]  JOHANNES FÜRNKRANZ,et al.  Separate-and-Conquer Rule Learning , 1999, Artificial Intelligence Review.

[95]  Martin V. Butz,et al.  Improving the Performance of a Pittsburgh Learning Classifier System Using a Default Rule , 2005, IWLCS.

[96]  Jacek M. Zurada,et al.  Training neural network classifiers for medical decision making: The effects of imbalanced datasets on classification performance , 2008, Neural Networks.

[97]  Richard Baumgartner,et al.  Data complexity assessment in undersampled classification of high-dimensional biomedical data , 2006, Pattern Recognit. Lett..