Multi-Objective Evolutionary Design of Composite Data-Driven Models

This paper proposes a multi-objective approach to the design of composite data-driven mathematical models. The approach automates the identification of graph-based heterogeneous pipelines composed of different blocks, such as machine learning models and data preprocessing operations. It is based on GPComp@Free, a parameter-free genetic algorithm (GA) for model design, and is developed as part of automated machine learning solutions to increase the efficiency of modeling pipeline automation. A set of experiments was conducted to verify the correctness and efficiency of the proposed approach and to substantiate the chosen design decisions. The experimental results confirm that a multi-objective approach to model design yields better diversity and quality of the obtained models. The implementation is available as part of the open-source AutoML framework FEDOT.
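
To illustrate what a multi-objective view of pipeline design means in practice, the following minimal sketch selects the Pareto-optimal (non-dominated) candidates from a small set. It is not the GPComp@Free algorithm or the FEDOT API. The two objectives (prediction error and number of blocks, both minimized), the `Candidate` type, and all pipeline names and values are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical pipeline summary with two objectives to minimize."""
    name: str         # illustrative pipeline label
    error: float      # assumed quality objective (lower is better)
    complexity: int   # assumed structural objective, e.g. number of blocks


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives and better on one."""
    no_worse = a.error <= b.error and a.complexity <= b.complexity
    strictly_better = a.error < b.error or a.complexity < b.complexity
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep candidates that no other candidate dominates (input order preserved)."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up values for illustration only; these are not results from the paper.
candidates = [
    Candidate("scaling -> logit", 0.20, 2),
    Candidate("scaling -> pca -> rf", 0.15, 3),
    Candidate("rf + knn -> logit", 0.15, 4),
    Candidate("knn", 0.30, 1),
    Candidate("pca -> knn", 0.32, 2),
]

front = pareto_front(candidates)
for c in front:
    print(f"{c.name}: error={c.error}, complexity={c.complexity}")
```

A single-objective search would return only the lowest-error pipeline, whereas the front keeps several trade-offs between accuracy and structure. Retaining such a set is one way a multi-objective formulation can preserve diversity among the obtained models.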
