A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability

Experimental analysis of the performance of a proposed method is a crucial and necessary task in any research study. This paper focuses on the statistical analysis of results in the field of genetics-based machine learning. It presents a study of a set of techniques that can be used to carry out a rigorous comparison among algorithms in terms of obtaining successful classification models. Two accuracy measures for multi-class problems are employed: classification rate and Cohen's kappa. In addition, two interpretability measures are employed: size of the rule set and number of antecedents. We study whether the samples of results obtained by genetics-based classifiers, under the performance measures cited above, satisfy the conditions required for analysis by parametric tests. The results show that the fulfilment of these conditions is problem-dependent and inconclusive, which supports the use of non-parametric statistics in the experimental analysis. Moreover, non-parametric tests can be satisfactorily employed to compare generic classifiers over various data sets with respect to any performance measure. On this basis, we propose the use of the most powerful non-parametric statistical tests to carry out multiple comparisons. However, the statistical analysis conducted on the interpretability measures must be treated with caution.
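The abstract refers to two accuracy measures (classification rate and Cohen's kappa) and to non-parametric multiple-comparison procedures, for which the Friedman test followed by a Holm post-hoc correction is a standard choice. The sketch below illustrates how these quantities could be computed; it is not the paper's implementation, and the function names (`cohen_kappa`, `friedman_with_holm`) and the assumed results matrix `scores` of shape (data sets x classifiers) are illustrative assumptions.

```python
# Minimal sketch, assuming a results matrix `scores` with one row per data set
# and one column per classifier (e.g., mean test accuracy or kappa per data set).
import numpy as np
from scipy import stats


def cohen_kappa(confusion):
    """Cohen's kappa from a square multi-class confusion matrix."""
    confusion = np.asarray(confusion, dtype=float)
    n = confusion.sum()
    p_observed = np.trace(confusion) / n                 # classification rate
    marginals = confusion.sum(axis=0) * confusion.sum(axis=1)
    p_expected = marginals.sum() / n ** 2                # chance agreement
    return (p_observed - p_expected) / (1.0 - p_expected)


def friedman_with_holm(scores, control=0, alpha=0.05):
    """Friedman test over data sets, then Holm-corrected pairwise Wilcoxon
    signed-rank tests of every classifier against a control classifier."""
    statistic, p_friedman = stats.friedmanchisquare(*scores.T)
    results = {"friedman_statistic": statistic,
               "friedman_p": p_friedman,
               "pairwise": []}

    others = [j for j in range(scores.shape[1]) if j != control]
    pvals = [stats.wilcoxon(scores[:, control], scores[:, j]).pvalue
             for j in others]

    # Holm's step-down procedure: sort p-values and compare the i-th smallest
    # against alpha / (k - i); stop rejecting at the first retained hypothesis.
    order = np.argsort(pvals)
    k = len(pvals)
    for i, idx in enumerate(order):
        reject = pvals[idx] <= alpha / (k - i)
        results["pairwise"].append((others[idx], pvals[idx], reject))
        if not reject:
            for idx2 in order[i + 1:]:
                results["pairwise"].append((others[idx2], pvals[idx2], False))
            break
    return results
```

For example, with a hypothetical `scores` array of shape (20, 4), `friedman_with_holm(scores)` reports the Friedman p-value and, for each classifier, whether its difference from the control remains significant after the Holm correction.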
