Auto-CVE: a coevolutionary approach to evolve ensembles in automated machine learning

Automated Machine Learning (Auto-ML) is a growing field that has attracted considerable attention. Several techniques, based on diverse approaches, have been developed to automate the construction of machine learning pipelines, with relative success, but the problem remains far from solved. Ensembles are frequently employed in machine learning because they tend to outperform single models and are more robust; however, they have so far received little attention in the Auto-ML field. This work presents Auto-CVE (Automated Coevolutionary Voting Ensemble), a new approach to Auto-ML. Based on a coevolutionary model, it uses two populations (one of ensembles and one of their component models) to actively search for voting ensembles. Compared with the popular TPOT algorithm, Auto-CVE shows competitive results in both accuracy and computing time.
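
To illustrate the two-population idea at a glance, the following minimal Python sketch coevolves a population of base classifiers (components) and a population of index sets (ensembles) that are scored as scikit-learn hard-voting ensembles. The search space, variation operators, population sizes, and the component-refresh step are illustrative assumptions made for this sketch, not the authors' actual Auto-CVE implementation.

# Minimal sketch of a two-population coevolutionary search for voting ensembles.
# All names, operators, and hyperparameters are illustrative assumptions,
# not the Auto-CVE implementation described in the paper.
import random

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
rng = random.Random(0)

def random_component():
    """Sample a candidate base classifier from a small, hypothetical search space."""
    choice = rng.choice(["knn", "tree", "logreg"])
    if choice == "knn":
        return KNeighborsClassifier(n_neighbors=rng.choice([1, 3, 5, 7]))
    if choice == "tree":
        return DecisionTreeClassifier(max_depth=rng.choice([2, 4, 8, None]), random_state=0)
    return LogisticRegression(C=rng.choice([0.01, 0.1, 1.0, 10.0]), max_iter=500)

# Component population: candidate base classifiers.
components = [random_component() for _ in range(8)]

# Ensemble population: each individual is a set of indices into the component population.
ensembles = [sorted(rng.sample(range(len(components)), k=rng.randint(2, 4))) for _ in range(6)]

def fitness(indices):
    """Cross-validated accuracy of a hard-voting ensemble built from the indexed components."""
    voter = VotingClassifier(estimators=[(f"c{i}", components[i]) for i in indices], voting="hard")
    return cross_val_score(voter, X, y, cv=3).mean()

for generation in range(5):
    survivors = sorted(ensembles, key=fitness, reverse=True)[: len(ensembles) // 2]

    # Ensemble-level variation: mutate one member index of each surviving ensemble.
    offspring = []
    for parent in survivors:
        child = list(parent)
        child[rng.randrange(len(child))] = rng.randrange(len(components))
        child = sorted(set(child))
        while len(child) < 2:  # keep at least two distinct members
            child = sorted(set(child + [rng.randrange(len(components))]))
        offspring.append(child)
    ensembles = survivors + offspring

    # Component-level variation: refresh one component at random. In a full coevolutionary
    # scheme, components would be selected by how well the ensembles using them perform;
    # the random refresh here is a simplification for brevity.
    components[rng.randrange(len(components))] = random_component()

best = max(ensembles, key=fitness)
print("best ensemble indices:", best, "accuracy: %.3f" % fitness(best))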
