An enhanced XCS rule discovery module using feature ranking

XCS is a genetics-based machine learning model that combines reinforcement learning with evolutionary algorithms to evolve a population of classifiers in the form of condition-action rules. Like many other machine learning algorithms, XCS becomes less effective on high-dimensional data sets. In this paper, we describe a new guided rule discovery mechanism for XCS, inspired by feature selection techniques commonly used in machine learning. In our approach, feature quality information is used to bias the evolutionary operators. A comprehensive set of experiments investigates how model performance is affected by the number of features used to bias the evolutionary operators, the population size, and the feature ranking technique. Numerical simulations show that our guided rule discovery mechanism improves the performance of XCS in terms of accuracy, execution time and, more generally, classifier diversity in the population, especially for high-dimensional classification problems. We present a detailed discussion of the effects of the model parameters and recommend settings for large-scale problems.
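To make the idea of biasing an evolutionary operator with feature quality information concrete, the following is a minimal sketch of one way it could be done for ternary conditions over {0, 1, #}. The abstract does not specify the operators, so everything here is an assumption for illustration: the function names `feature_weights` and `biased_mutation`, the normalisation of scores by the maximum score, and the rule that highly ranked features are more likely to be specified while poorly ranked ones are more likely to be generalised to `#`. It is not the mechanism described in the paper.

```python
import random


def feature_weights(scores):
    """Map non-negative feature scores to weights in [0, 1].

    Hypothetical normalisation: divide by the largest score, so the
    best-ranked feature gets weight 1.0. If no score is positive,
    fall back to a neutral weight of 0.5 for every feature.
    """
    top = max(scores)
    if top <= 0:
        return [0.5] * len(scores)
    return [max(0.0, s / top) for s in scores]


def biased_mutation(condition, situation, weights, mu=0.04, rng=random):
    """Mutate a ternary condition string, biased by feature weights.

    Assumed rule (illustrative only): each position is visited with
    probability mu. A '#' is replaced by the matching input symbol with
    probability w; a specified symbol is generalised to '#' with
    probability 1 - w. Informative features therefore tend to stay
    specified, and uninformative ones tend to become don't-cares.
    """
    out = []
    for c, s, w in zip(condition, situation, weights):
        if rng.random() < mu:
            if c == "#":
                if rng.random() < w:
                    c = s  # specialise to the current input value
            elif rng.random() < 1.0 - w:
                c = "#"  # generalise a low-weight feature
        out.append(c)
    return "".join(out)


if __name__ == "__main__":
    weights = feature_weights([2.0, 1.0, 0.0])
    print(weights)
    print(biased_mutation("1#0", "110", weights, mu=1.0))
```

In this sketch the scores could come from any feature ranking technique; the number of features that receive a non-neutral weight and the base rate `mu` are the kind of parameters whose effect the experiments in the abstract are said to examine.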
