Using Metalearning to Predict When Parameter Optimization Is Likely to Improve Classification Accuracy

Work on metalearning for algorithm selection has often been criticized because it mostly considers only the default parameter settings of the candidate base learning algorithms. Many have argued that the choice of parameter values can have a significant impact on accuracy, yet little empirical evidence exists to support that argument definitively. Recent experiments do suggest that parameter optimization can have an impact. However, the distribution of performance differences has a long tail, suggesting that in most cases parameter optimization has little effect on accuracy. In this paper, we revisit some of these results and use metalearning to characterize the situations in which parameter optimization is likely to yield a significant increase in accuracy. In so doing, we show that 1) a relatively simple and efficient landmarker carries significant predictive power, and 2) metalearning for algorithm selection should be carried out in two phases: the first determines whether parameter optimization is likely to increase accuracy, and the second performs the actual algorithm selection.
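The two-phase scheme can be illustrated with a minimal sketch. It is not the method used in the paper: the abstract does not specify the landmarker or the meta-learner, so the sketch assumes a single numeric landmarker value per dataset, a decision stump as the phase-one meta-model, and a hypothetical gain threshold `min_gain`. All names (`MetaExample`, `fit_stump`, `should_optimize`, `recommend`) and all numbers are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MetaExample:
    landmarker: float  # hypothetical landmarker value measured on one dataset
    gain: float        # accuracy with optimized parameters minus accuracy with defaults


def fit_stump(examples: List[MetaExample], min_gain: float = 0.015) -> Tuple[float, bool]:
    """Phase 1 training: learn a one-feature decision stump on the landmarker.

    A dataset is labeled positive when optimization improved accuracy by more
    than min_gain (an assumed threshold). Returns (cut, low_side), where
    low_side says whether values <= cut are predicted to benefit.
    """
    labels = [e.gain > min_gain for e in examples]
    best_correct, best_stump = -1, (0.0, True)
    for cut in sorted({e.landmarker for e in examples}):
        for low_side in (True, False):
            correct = sum(
                ((e.landmarker <= cut) == low_side) == y
                for e, y in zip(examples, labels)
            )
            if correct > best_correct:
                best_correct, best_stump = correct, (cut, low_side)
    return best_stump


def should_optimize(stump: Tuple[float, bool], landmarker: float) -> bool:
    """Phase 1 prediction: is parameter optimization likely to pay off?"""
    cut, low_side = stump
    return (landmarker <= cut) == low_side


def recommend(stump: Tuple[float, bool], landmarker: float,
              default_scores: Dict[str, float],
              tuned_scores: Dict[str, float]) -> Tuple[str, bool]:
    """Phase 2: select an algorithm, given the phase-1 decision.

    default_scores and tuned_scores stand in for whatever accuracy estimates
    a real metalearner would produce; here they are plain dictionaries.
    """
    optimize = should_optimize(stump, landmarker)
    scores = tuned_scores if optimize else default_scores
    return max(scores, key=scores.get), optimize


# Toy meta-dataset (invented numbers): low landmarker values go with large gains.
meta = [
    MetaExample(0.55, 0.08), MetaExample(0.60, 0.05), MetaExample(0.62, 0.04),
    MetaExample(0.85, 0.00), MetaExample(0.90, 0.005), MetaExample(0.95, -0.01),
]
stump = fit_stump(meta)
default_scores = {"tree": 0.80, "svm": 0.70}
tuned_scores = {"tree": 0.82, "svm": 0.90}
print(recommend(stump, 0.58, default_scores, tuned_scores))
print(recommend(stump, 0.88, default_scores, tuned_scores))
```

Separating the two decisions means the cost of parameter optimization is incurred only for datasets where the first phase predicts a meaningful gain; for the rest, selection proceeds over default settings.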
