MFE: Towards reproducible meta-feature extraction

Automated recommendation of machine learning algorithms has received a great deal of attention, not only because it can recommend the most suitable algorithms for a new task, but also because it can support efficient hyper-parameter tuning, leading to better machine learning solutions. Automated recommendation can be implemented using meta-learning, i.e., learning from previous learning experiences to create a meta-model that associates a data set with the predictive performance of machine learning algorithms. Although a large number of publications report the use of meta-learning, reproducing and comparing meta-learning experiments remains difficult. The literature lacks extensive and comprehensive public tools that enable the reproducible investigation of different meta-learning approaches. One way to address this difficulty is to develop a meta-feature extraction package that implements the main characterization measures and follows uniform guidelines facilitating the use and inclusion of new meta-features. In this paper, we propose two Meta-Feature Extractor (MFE) packages, written in Python and R, to fill this gap. The packages follow recent frameworks for meta-feature extraction, aiming to facilitate the reproducibility of meta-learning experiments.
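
As a rough illustration of the kind of workflow such a package enables, the sketch below extracts meta-features from a data set with the Python MFE implementation (pymfe). The class and method names (MFE, fit, extract) and the group labels reflect the publicly documented pymfe interface as understood here; treat them as assumptions to be checked against the installed version rather than a definitive specification.

    # Minimal sketch: extracting meta-features with the Python MFE package (pymfe).
    # Assumes pymfe and scikit-learn are installed.
    from sklearn.datasets import load_iris
    from pymfe.mfe import MFE

    X, y = load_iris(return_X_y=True)

    # Choose the meta-feature groups to compute (e.g., general, statistical,
    # and information-theoretic measures) and fit the extractor to the data.
    mfe = MFE(groups=["general", "statistical", "info-theory"])
    mfe.fit(X, y)

    # extract() returns the meta-feature names and their computed values.
    names, values = mfe.extract()
    for name, value in zip(names, values):
        print(f"{name}: {value}")

The extracted name/value pairs form the characterization vector of the data set, which can then be stored alongside algorithm performance results to build a meta-base for meta-learning experiments.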
