Towards better generalization in Pittsburgh learning classifier systems

Generalization ability is an important concern in any classification task. This paper proposes EDARIC, a new evolutionary system based on the Pittsburgh approach to evolutionary machine learning and classification. The system takes a destructive approach: it starts with large rules and gradually reduces their size as evolution progresses. Unlike most previous work, EDARIC adopts an intelligent deletion mechanism, evolves a separate population for each class of a given problem, and uses an ensemble to classify unknown instances. These features help avoid over-fitting and class-imbalance problems, which in turn improves the generalization ability of the classification system. EDARIC also applies a rule post-processing step that relieves the evolution phase of the burden of tuning a large number of parameters. Experimental results on various benchmark classification problems show that EDARIC generalizes better than many existing algorithms in the literature on both standard and imbalanced datasets.
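The idea of keeping separate rule sets per class and combining them in an ensemble can be illustrated with a minimal sketch. This is not EDARIC's actual implementation: the interval-based rule encoding, the fraction-of-matching-members vote, and all names (`Condition`, `Rule`, `RuleSet`, `rule_matches`, `ruleset_matches`, `predict`) are assumptions made purely for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical encoding (an assumption, not taken from the paper):
Condition = Tuple[int, float, float]  # (attribute index, lower bound, upper bound)
Rule = List[Condition]                # conjunction of interval conditions
RuleSet = List[Rule]                  # disjunction of rules describing one class


def rule_matches(rule: Rule, x: Sequence[float]) -> bool:
    """A rule fires when every one of its interval conditions holds for x."""
    return all(lo <= x[i] <= hi for i, lo, hi in rule)


def ruleset_matches(rules: RuleSet, x: Sequence[float]) -> bool:
    """A rule set covers x when at least one of its rules fires."""
    return any(rule_matches(r, x) for r in rules)


def predict(ensemble: Dict[str, List[RuleSet]], x: Sequence[float]) -> str:
    """Return the class whose rule sets cover x most often.

    `ensemble` maps each class label to a non-empty list of rule sets,
    e.g. the best individuals kept from that class's own population.
    """
    votes = {
        label: sum(ruleset_matches(rs, x) for rs in members) / len(members)
        for label, members in ensemble.items()
    }
    return max(votes, key=votes.get)


# Toy ensemble over a single attribute: two rule sets for "a", one for "b".
toy_ensemble = {
    "a": [[[(0, 0.0, 1.0)]], [[(0, 0.0, 0.5)]]],
    "b": [[[(0, 1.0, 2.0)]]],
}
```

Normalizing each class's vote by the number of its rule sets keeps a class with more ensemble members from dominating, which is one simple way to stay neutral under class imbalance; the paper's own combination scheme may differ.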
