Practical Automated Machine Learning for the AutoML Challenge 2018

Despite great successes in many fields, machine learning typically requires substantial human effort to determine a good machine learning pipeline, including various types of preprocessing and the choice of classifiers and hyperparameters. AutoML aims to free human practitioners and researchers from these menial tasks. The current state of the art in AutoML was evaluated in the AutoML Challenge 2018. Here, we describe our winning entry to this challenge, dubbed PoSH Auto-sklearn, which combines an automatically preselected portfolio, ensemble building, and Bayesian optimization with successive halving. Finally, we share insights into the importance of the different parts of our approach.
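
To illustrate the successive-halving component mentioned above: all candidate pipelines are evaluated on a small resource budget, only the best-performing fraction survives, and the survivors are re-evaluated with a larger budget. The sketch below is a minimal illustration under assumptions, not the authors' implementation; the `evaluate` function, the `true_loss` field, and the toy portfolio are hypothetical stand-ins for training a real pipeline under a budget (e.g., training iterations or a data subsample size).

```python
import numpy as np

rng = np.random.default_rng(0)

def evaluate(config, budget):
    # Hypothetical stand-in for training a pipeline with the given
    # budget and returning its validation loss (lower is better).
    # Noise shrinks with budget, mimicking more reliable estimates
    # from longer training.
    return config["true_loss"] + rng.normal(0.0, 1.0 / np.sqrt(budget))

def successive_halving(candidates, evaluate, min_budget=1, eta=3):
    """Evaluate all candidates on the current budget, keep the
    best 1/eta fraction, multiply the budget by eta, and repeat
    until a single candidate remains."""
    budget = min_budget
    while len(candidates) > 1:
        losses = [evaluate(c, budget) for c in candidates]
        keep = max(1, len(candidates) // eta)
        best = np.argsort(losses)[:keep]
        candidates = [candidates[i] for i in best]
        budget *= eta
    return candidates[0]

# A toy "portfolio" of nine hypothetical pipeline configurations;
# in PoSH Auto-sklearn the portfolio is preselected via meta-learning.
portfolio = [{"id": i, "true_loss": rng.uniform(0, 1)} for i in range(9)]
print(successive_halving(portfolio, evaluate))
```

In the actual system, successive halving is run inside Bayesian optimization rather than over a fixed candidate set, and the surviving models are combined via ensemble selection; the sketch only conveys the budget-allocation idea.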
