On the statistical analysis of the parameters’ trend in a machine learning algorithm

Statistical validation of the conclusions drawn from an experimental study is increasingly demanded in research. Although statistical tests are usually applied to compare the performance of several algorithms, they can also serve other tasks, such as selecting appropriate parameter values or studying the trend of a parameter. In this short paper, we describe a non-parametric test, the Page test, which can be used to test for a predicted ordering of experimental conditions. We include an illustrative example of its use on classification problems with the well-known $k$-nearest neighbour algorithm.
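As a rough illustration of the kind of analysis described above, the following sketch computes Page's $L$ statistic and its large-sample normal approximation in plain Python. It is not the paper's own implementation. The function names `rank_row` and `page_test` are hypothetical, and the error rates are invented purely for illustration (rows are data sets, columns are increasing values of $k$ in the $k$-nearest neighbour classifier). The columns must be arranged in the order predicted by the alternative hypothesis, here "error grows as $k$ grows".

```python
import math


def rank_row(values):
    """Rank values in ascending order (1 = smallest), averaging tied ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = average_rank
        i = j + 1
    return ranks


def page_test(data):
    """Return (L, z, p) for Page's trend test.

    data: list of n rows (blocks, e.g. data sets), each with k measurements
    (conditions, e.g. parameter values) in the predicted increasing order.
    The p-value is one-sided and uses the normal approximation, without
    any correction for ties.
    """
    n, k = len(data), len(data[0])
    rank_sums = [0.0] * k
    for row in data:
        for j, r in enumerate(rank_row(row)):
            rank_sums[j] += r
    # L weights each column's rank sum by its predicted position.
    L = sum((j + 1) * rank_sums[j] for j in range(k))
    mean = n * k * (k + 1) ** 2 / 4
    var = n * k ** 2 * (k + 1) * (k ** 2 - 1) / 144
    z = (L - mean) / math.sqrt(var)
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return L, z, p


# Invented error rates: 5 data sets (rows) x 4 values of k (columns: 1, 3, 5, 7).
errors = [
    [0.10, 0.12, 0.15, 0.17],
    [0.22, 0.21, 0.25, 0.28],
    [0.05, 0.07, 0.06, 0.09],
    [0.31, 0.33, 0.36, 0.35],
    [0.18, 0.19, 0.22, 0.24],
]
L, z, p = page_test(errors)
print(f"L = {L:.1f}, z = {z:.3f}, one-sided p = {p:.4f}")
```

A small p-value would support the predicted ordering. For a small number of data sets or conditions, exact critical values of $L$ are preferable to the normal approximation used in this sketch.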
