Addressing the Algorithm Selection Problem through an Attention-Based Meta-Learner Approach

In the algorithm selection problem, where the task is to identify the most suitable solving technique for a particular situation, the models used as performance mapping mechanisms have typically been relatively simple, such as logistic regression or neural networks. In the latter case, most implementations rely on shallow, straightforward architectures and thus have a limited ability to extract relevant patterns. This research explores the use of attention-based neural networks as meta-learners to improve the performance mapping mechanism in the algorithm selection problem and to take full advantage of the model’s capabilities for pattern extraction. We compare the proposed attention-based meta-learner, used as a performance mapping mechanism, against five models from the literature: multi-layer perceptron, k-nearest neighbors, softmax regression, support vector machines, and decision trees. For testing, we used a meta-dataset obtained by solving the vehicle routing problem with time windows (VRPTW) instances from the Solomon benchmark with three different configurations of the simulated annealing meta-heuristic. Overall, the attention-based meta-learner outperforms the benchmark methods, more consistently selecting the algorithm that best solves a given VRPTW instance. Moreover, by significantly outperforming the multi-layer perceptron, our findings suggest promising potential in exploring more recent advancements in neural network architectures.
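
To make the setup concrete, the sketch below shows one way an attention-based meta-learner could map a vector of instance meta-features to the best of three simulated annealing configurations. This is a minimal illustration, not the architecture reported in the paper: the feature dimension, embedding size, number of attention heads, and the choice of per-feature tokens with multi-head self-attention followed by mean pooling are all assumptions made here for illustration.

```python
# Hypothetical sketch of an attention-based meta-learner for algorithm selection.
# Dimensions and the self-attention layout are illustrative assumptions, not the
# architecture described in the paper.
import torch
import torch.nn as nn


class AttentionMetaLearner(nn.Module):
    """Maps a vector of VRPTW instance meta-features to one of several
    candidate solvers (here: three simulated annealing configurations)."""

    def __init__(self, n_meta_features: int, n_algorithms: int = 3,
                 d_model: int = 32, n_heads: int = 4):
        super().__init__()
        # Embed each scalar meta-feature as a token so self-attention can
        # weigh interactions between features.
        self.embed = nn.Linear(1, d_model)
        self.attention = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.classifier = nn.Sequential(
            nn.LayerNorm(d_model),
            nn.Linear(d_model, n_algorithms),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_meta_features) -> tokens: (batch, n_meta_features, d_model)
        tokens = self.embed(x.unsqueeze(-1))
        attended, _ = self.attention(tokens, tokens, tokens)
        # Pool over the feature tokens and score each candidate algorithm.
        return self.classifier(attended.mean(dim=1))


if __name__ == "__main__":
    model = AttentionMetaLearner(n_meta_features=16)
    meta_features = torch.randn(8, 16)   # e.g. 8 instances, 16 meta-features each
    logits = model(meta_features)        # (8, 3): one score per SA configuration
    print(logits.argmax(dim=1))          # predicted best configuration per instance
```

Trained with a standard cross-entropy loss on meta-data labeled with the best-performing configuration per instance, such a model plays the same role as the simpler performance mappers (logistic regression, shallow MLPs) it is compared against.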
