Domain of competence of XCS classifier system in complexity measurement space

The XCS classifier system has recently shown a high degree of competence on a variety of data mining problems, but the kinds of problems to which XCS is well or poorly suited are seldom understood, especially for real-world classification problems. The main obstacle has been the difficulty of determining the intrinsic characteristics of real-world classification problems. This paper investigates the domain of competence of XCS by means of a methodology that characterizes the complexity of a classification problem by a set of geometrical descriptors. In a study of 392 classification problems, each with its complexity characterization, we identify difficult and easy domains for XCS. We focus on XCS with hyperrectangle codification, which has been predominantly used for real-attributed domains. The results show high correlations between XCS's performance and measures of the length of class boundaries, the compactness of classes, and the nonlinearity of decision boundaries. We also compare the performance of XCS with that of other traditional classifier schemes. Besides confirming the high degree of competence of XCS on these problems, we relate the behavior of the different classifier schemes to the geometrical complexity of the problem. Moreover, the results highlight certain regions of the complexity measurement space where a given classifier scheme excels, establishing a first step toward determining the best classifier scheme for a given classification problem.
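To make the notion of a geometrical descriptor concrete, the following is a minimal sketch of one commonly used measure of class-boundary length: the fraction of points that are joined to a point of another class by an edge of the minimum spanning tree built over the whole data set. The abstract does not define its measures, so the function name `boundary_fraction`, the Euclidean distance, and the use of Prim's algorithm are illustrative assumptions, not the exact procedure used in the study.

```python
import math


def boundary_fraction(points, labels):
    """Fraction of points touching an MST edge that joins two different classes.

    Illustrative assumption: Euclidean distance and Prim's algorithm (O(n^2)).
    Values near 0 suggest well-separated classes; values near 1 suggest
    heavily interleaved classes.
    """
    n = len(points)
    if n < 2:
        return 0.0

    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known connection to the tree
    best_from = [-1] * n         # tree vertex that provides that connection
    best_dist[0] = 0.0
    on_boundary = set()

    for _ in range(n):
        # Pick the closest vertex that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True

        # The edge (best_from[u], u) is now part of the MST.
        v = best_from[u]
        if v >= 0 and labels[u] != labels[v]:
            on_boundary.update((u, v))

        # Relax the connection cost of the remaining vertices.
        for w in range(n):
            if not in_tree[w]:
                d = math.dist(points[u], points[w])
                if d < best_dist[w]:
                    best_dist[w] = d
                    best_from[w] = u

    return len(on_boundary) / n


# Two compact, well-separated classes versus two interleaved classes.
xs = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
separated = boundary_fraction(xs, [0, 0, 0, 1, 1, 1])
interleaved = boundary_fraction(xs, [0, 1, 0, 1, 0, 1])
```

In this toy case only one spanning-tree edge crosses the class boundary when the classes are separated, whereas every edge crosses it when the labels alternate, so the descriptor rises with boundary length.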
