Cross-disciplinary perspectives on meta-learning for algorithm selection

The algorithm selection problem [Rice 1976] seeks to answer the question: which algorithm is likely to perform best for my problem? Recognizing the problem as a learning task in the early 1990s, the machine learning community developed the field of meta-learning, focused on learning about the performance of learning algorithms on classification problems. These ideas have seen only limited generalization beyond classification, while other disciplines (such as AI and operations research) have tackled the algorithm selection problem in parallel, introducing different terminology and overlooking the similarities between approaches. There is therefore much to be gained from a greater awareness of developments in meta-learning, and from generalizing these ideas to learn about the behaviors of other (nonlearning) algorithms. In this article we present a unified framework for considering the algorithm selection problem as a learning problem, and we use this framework to tie together the cross-disciplinary developments in tackling it. We discuss the generalization of meta-learning concepts to algorithms for tasks including sorting, forecasting, constraint satisfaction, and optimization, and the extension of these ideas to bioinformatics, cryptography, and other fields.
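The framework the abstract describes, mapping measurable features of problem instances to the algorithm expected to perform best, can be sketched as a toy meta-learning loop. Everything below is an illustrative assumption rather than the paper's own method: the algorithm space is two sorting routines compared by comparison counts, the feature space is a single "presortedness" measure, and the meta-learner is 1-nearest neighbour.

```python
# A toy sketch of algorithm selection as a learning problem.
# The algorithm space (two sorting routines), the single presortedness
# feature, and the 1-NN meta-learner are all illustrative assumptions.
import random

def insertion_sort_cost(a):
    """Comparisons made by insertion sort (cheap on nearly sorted input)."""
    a, cost = list(a), 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            cost += 1
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
                j -= 1
            else:
                break
    return cost

def merge_sort_cost(a):
    """Comparisons made by top-down merge sort (~n log n regardless of order)."""
    if len(a) <= 1:
        return 0
    mid = len(a) // 2
    cost = merge_sort_cost(a[:mid]) + merge_sort_cost(a[mid:])
    left, right, i, j = sorted(a[:mid]), sorted(a[mid:]), 0, 0
    while i < len(left) and j < len(right):
        cost += 1
        if left[i] <= right[j]:
            i += 1
        else:
            j += 1
    return cost

ALGORITHMS = {"insertion": insertion_sort_cost, "merge": merge_sort_cost}

def features(a):
    """Feature space: fraction of adjacent pairs out of order."""
    return (sum(a[i] > a[i + 1] for i in range(len(a) - 1)) / max(len(a) - 1, 1),)

def best_algorithm(a):
    """Performance space: label an instance with its empirically cheapest algorithm."""
    return min(ALGORITHMS, key=lambda name: ALGORITHMS[name](a))

def nearly_sorted(n, swaps, rng):
    """Problem space: a sorted list perturbed by a few adjacent transpositions."""
    a = list(range(n))
    for _ in range(swaps):
        i = rng.randrange(n - 1)
        a[i], a[i + 1] = a[i + 1], a[i]
    return a

def select_algorithm(meta, a):
    """Meta-learner: 1-nearest neighbour over the meta-dataset of labelled features."""
    x = features(a)
    nearest = min(meta, key=lambda rec: sum((u - v) ** 2 for u, v in zip(rec[0], x)))
    return nearest[1]

rng = random.Random(42)
training = [nearly_sorted(100, s, rng) for s in (0, 2, 4)] \
         + [rng.sample(range(100), 100) for _ in range(3)]
meta = [(features(a), best_algorithm(a)) for a in training]  # the meta-dataset

print(select_algorithm(meta, nearly_sorted(100, 3, rng)))   # expect: insertion
print(select_algorithm(meta, rng.sample(range(100), 100)))  # expect: merge
```

The four parts mirror Rice's formulation: a problem space (the generated lists), a feature space (presortedness), an algorithm space (the two sorts), and a performance space (comparison counts); the meta-dataset of (features, best algorithm) pairs is what any supervised learner would be trained on.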

[1]  Jian Yang,et al.  Algorithm Selection: A Quantitative Approach , 2006 .

[2]  Kevin Leyton-Brown,et al.  A Portfolio Approach to Algorithm Selection , 2003, IJCAI.

[3]  Ricardo Vilalta,et al.  A Perspective View and Survey of Meta-Learning , 2002, Artificial Intelligence Review.

[4]  Toby Walsh,et al.  Backbones in Optimization and Approximation , 2001, IJCAI.

[5]  Victor Ciesielski,et al.  Matching Data Mining Algorithm Suitability to Data Characteristics Using a Self-Organizing Map , 2001, HIS.

[6]  Horst Samulowitz,et al.  Learning to Solve QBF , 2007, AAAI.

[7]  Carla E. Brodley,et al.  Addressing the Selective Superiority Problem: Automatic Algorithm/Model Class Selection , 1993 .

[8]  H. Rice Classes of recursively enumerable sets and their decision problems , 1953 .

[9]  John R. Rice,et al.  The Algorithm Selection Problem , 1976, Adv. Comput..

[10]  Giovanna Castellano,et al.  Meta-data: Characterization of Input Features for Meta-learning , 2005, MDAI.

[11]  W. Hsu,et al.  Algorithm selection for sorting and probabilistic inference: a machine learning-based approach , 2003 .

[13]  Kate Smith-Miles,et al.  Kernel Width Selection for SVM Classification: A Meta-Learning Approach , 2005, Int. J. Data Warehous. Min..

[14]  A. Gupta,et al.  A Bayesian Approach to , 1997 .

[15]  Jürgen Schmidhuber,et al.  Learning dynamic algorithm portfolios , 2006, Annals of Mathematics and Artificial Intelligence.

[16]  Vincent Bachelet  Hybrid parallel metaheuristics: application to the quadratic assignment problem , 1999 .

[17]  Charles C. Taylor,et al.  Meta-Analysis: From Data Characterisation for Meta-Learning to Meta-Regression , 2000 .

[18]  Jano I. van Hemert,et al.  Evolving Combinatorial Problem Instances That Are Difficult to Solve , 2006, Evolutionary Computation.

[19]  Michail G. Lagoudakis,et al.  Selecting the Right Algorithm , 2001 .

[20]  Rudi Studer,et al.  AST: Support for Algorithm Selection with a CBR Approach , 1999, PKDD.

[21]  Johannes Fürnkranz,et al.  An Evaluation of Landmarking Variants , 2001 .

[22]  João Gama,et al.  Characterization of Classification Algorithms , 1995, EPIA.

[23]  Jeffrey Sohl,et al.  An intelligent model selection and forecasting system , 1999 .

[24]  Nigel Meade,et al.  A comparison of the accuracy of short term foreign exchange forecasting methods , 2002 .

[25]  Kevin Leyton-Brown,et al.  Hierarchical Hardness Models for SAT , 2007, CP.

[26]  David H. Wolpert,et al.  No free lunch theorems for optimization , 1997, IEEE Trans. Evol. Comput..

[27]  Stephen F. Smith,et al.  Combining Multiple Heuristics Online , 2007, AAAI.

[28]  Peter A. Flach,et al.  Improved Dataset Characterisation for Meta-learning , 2002, Discovery Science.

[29]  Hilan Bensusan,et al.  A Higher-order Approach to Meta-learning , 2000, ILP Work-in-progress reports.

[30]  Hector J. Levesque,et al.  Generating Hard Satisfiability Problems , 1996, Artif. Intell..

[31]  Yoav Shoham,et al.  Learning the Empirical Hardness of Optimization Problems: The Case of Combinatorial Auctions , 2002, CP.

[32]  Ingrid Zukerman,et al.  A Meta-learning Approach for Selecting between Response Automation Strategies in a Help-desk Domain , 2007, AAAI.

[33]  Bay Arinze,et al.  Combining and selecting forecasting models using rule based induction , 1997, Comput. Oper. Res..

[35]  W. Armstrong,et al.  Dynamic Algorithm Selection Using Reinforcement Learning , 2006, 2006 International Workshop on Integrating AI and Data Mining.

[36]  Michel Gendreau,et al.  Handbook of Metaheuristics , 2010 .

[37]  Yoav Shoham,et al.  Boosting as a Metaphor for Algorithm Design , 2003, CP.

[38]  Kate A. Smith,et al.  Characteristic-based Forecasting for Time Series Data , 2005 .

[39]  Kate Smith-Miles,et al.  On learning algorithm selection for classification , 2006, Appl. Soft Comput..

[40]  Howard Cho,et al.  Empirical Learning as a Function of Concept Character , 1990, Machine Learning.

[41]  Xiaozhe Wang,et al.  Characteristic-Based Clustering for Time Series Data , 2006, Data Mining and Knowledge Discovery.

[42]  Constantin Halatsis,et al.  Measures of Intrinsic Hardness for Constraint Satisfaction Problem Instances , 2004, SOFSEM.

[43]  Teuvo Kohonen,et al.  Self-Organization and Associative Memory , 1988 .

[44]  Peter Merz,et al.  Advanced Fitness Landscape Analysis and the Performance of Memetic Algorithms , 2004, Evolutionary Computation.

[45]  Hilan Bensusan,et al.  Meta-Learning by Landmarking Various Learning Algorithms , 2000, ICML.

[46]  Ian Witten,et al.  Data Mining , 2000 .

[47]  S. Schiffer,et al.  Analysis of the Results , 1971 .

[48]  Alexandros Kalousis,et al.  Model Selection via Meta-learning: A Comparative Study , 2001 .

[49]  Thomas Stützle,et al.  Towards a Characterisation of the Behaviour of Stochastic Local Search Algorithms for SAT , 1999, Artif. Intell..

[50]  Bay Arinze,et al.  Selecting appropriate forecasting models using rule induction , 1994 .

[51]  David J. Spiegelhalter,et al.  Machine Learning, Neural and Statistical Classification , 2009 .

[52]  Assaf Naor,et al.  Rigorous location of phase transitions in hard optimization problems , 2005, Nature.

[53]  Carlos Soares,et al.  Exploiting Sampling and Meta-learning for Parameter Setting forSupport Vector Machines , 2002 .

[54]  F. Glover,et al.  Handbook of Metaheuristics , 2019, International Series in Operations Research & Management Science.

[55]  Kevin Leyton-Brown,et al.  SATzilla-07: The Design and Analysis of an Algorithm Portfolio for SAT , 2007, CP.

[56]  Hilan Bensusan,et al.  Estimating the Predictive Accuracy of a Classifier , 2001, ECML.

[57]  Carlos Soares,et al.  A Meta-Learning Method to Select the Kernel Width in Support Vector Regression , 2004, Machine Learning.

[58]  D. Opitz,et al.  Popular Ensemble Methods: An Empirical Study , 1999, J. Artif. Intell. Res..

[59]  Abraham Bernstein,et al.  Toward intelligent assistance for a data mining process: an ontology-based approach for cost-sensitive classification , 2005, IEEE Transactions on Knowledge and Data Engineering.

[60]  Melanie Hilario Model Complexity and Algorithm Selection in Classification , 2002, Discovery Science.

[61]  Alexandros Kalousis,et al.  NOEMON: Design, implementation and performance results of an intelligent assistant for classifier selection , 1999, Intell. Data Anal..

[62]  Panagiotis Stamatopoulos,et al.  Combinatorial optimization through statistical instance-based learning , 2001, Proceedings 13th IEEE International Conference on Tools with Artificial Intelligence. ICTAI 2001.

[63]  Eamonn J. Keogh,et al.  UCR Time Series Data Mining Archive , 2002 .

[64]  David J. Slate,et al.  Letter Recognition Using Holland-Style Adaptive Classifiers , 1991, Machine Learning.

[65]  Riccardo Poli,et al.  Kolmogorov complexity, Optimization and Hardness , 2006, 2006 IEEE International Conference on Evolutionary Computation.

[66]  Yoav Shoham,et al.  Understanding Random SAT: Beyond the Clauses-to-Variables Ratio , 2004, CP.

[67]  Kyoritsu Shuppan  Computer Science: ACM Computing Surveys , 1978 .

[68]  Philip K. Chan,et al.  Meta-learning in distributed data mining systems: Issues and approaches , 2007 .

[69]  Teuvo Kohonen,et al.  Self-organization and associative memory: 3rd edition , 1989 .

[70]  Kate A. Smith,et al.  Modelling the relationship between problem characteristics and data mining algorithm performance using neural networks , 2001 .

[71]  Carlos Soares,et al.  Sampling-Based Relative Landmarks: Systematically Test-Driving Algorithms Before Choosing , 2001, EPIA.

[72]  Wei-Yin Loh,et al.  A Comparison of Prediction Accuracy, Complexity, and Training Time of Thirty-Three Old and New Classification Algorithms , 2000, Machine Learning.

[73]  Daren Ler Utilising Regression-based Landmarkers within a Meta-learning Framework for Algorithm Selection , 2005 .

[74]  C. E. Shannon  A mathematical theory of communication , 1948, Bell System Technical Journal.

[75]  Iain Paterson,et al.  Evaluation of Machine-Learning Algorithm Ranking Advisors , 2000 .

[76]  Vassilis Zissimopoulos,et al.  On the Hardness of the Quadratic Assignment Problem with Metaheuristics , 2002, J. Heuristics.

[77]  Carlos Soares,et al.  Ranking Learning Algorithms: Using IBL and Meta-Learning on Accuracy and Time Results , 2003, Machine Learning.

[78]  Mark Wallace,et al.  Finding the Right Hybrid Algorithm - A Combinatorial Meta-Problem , 2000, Electron. Notes Discret. Math..

[79]  Terry Jones,et al.  Fitness Distance Correlation as a Measure of Problem Difficulty for Genetic Algorithms , 1995, ICGA.

[80]  Salvatore J. Stolfo,et al.  On the Accuracy of Meta-learning for Scalable Data Mining , 2004, Journal of Intelligent Information Systems.

[81]  Ramanathan Gnanadesikan  Methods for Statistical Data Analysis of Multivariate Observations , 1997 .

[82]  Chao-Hsien Chu,et al.  Neural network system for forecasting method selection , 1994, Decis. Support Syst..

[83]  David W. Aha,et al.  Generalizing from Case studies: A Case Study , 1992, ML.

[84]  Thomas Stützle,et al.  New Benchmark Instances for the QAP and the Experimental Analysis of Algorithms , 2004, EvoCOP.

[85]  Derick Wood,et al.  A survey of adaptive sorting algorithms , 1992, CSUR.

[86]  Andrew W. Moore,et al.  The Racing Algorithm: Model Selection for Lazy Learners , 1997, Artificial Intelligence Review.

[87]  David W. Corne,et al.  Towards Landscape Analyses to Inform the Design of Hybrid Local Search for the Multiobjective Quadratic Assignment Problem , 2002, HIS.

[88]  David Maxwell Chickering,et al.  A Bayesian Approach to Tackling Hard Computational Problems (Preliminary Report) , 2001, Electron. Notes Discret. Math..

[89]  Kate Smith-Miles,et al.  A meta-learning approach to automatic kernel selection for support vector machines , 2006, Neurocomputing.

[90]  Elwood S. Buffa,et al.  The Facilities Layout Problem in Perspective , 1966 .

[91]  T. Martinez,et al.  Estimating The Potential for Combining Learning Models , 2005 .

[92]  Graham Kendall,et al.  Hyper-Heuristics: An Emerging Direction in Modern Search Technology , 2003, Handbook of Metaheuristics.

[93]  David H. Wolpert,et al.  Stacked generalization , 1992, Neural Networks.

[94]  Saso Dzeroski,et al.  Ranking with Predictive Clustering Trees , 2002, ECML.

[95]  Kate Smith-Miles,et al.  Towards insightful algorithm selection for optimisation using meta-learning concepts , 2008, 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence).

[97]  Saso Dzeroski,et al.  Combining Classifiers with Meta Decision Trees , 2003, Machine Learning.

[98]  P. Brazdil,et al.  Analysis of results , 1995 .

[99]  Spyros Makridakis,et al.  The M3-Competition: results, conclusions and implications , 2000 .

[101]  Teresa Bernarda Ludermir,et al.  Meta-learning approaches to selecting time series models , 2004, Neurocomputing.

[102]  Mark Wallace,et al.  Finding the Right Hybrid Algorithm – A Combinatorial Meta-Problem , 2002, Annals of Mathematics and Artificial Intelligence.

[103]  Ramanathan Gnanadesikan,et al.  Methods for statistical data analysis of multivariate observations , 1977, A Wiley publication in applied statistics.

[104]  C. S. Wallace,et al.  An Information Measure for Classification , 1968, Comput. J..

[105]  João Gama,et al.  Cascade Generalization , 2000, Machine Learning.

[106]  Alexander K. Seewald,et al.  Hybrid Decision Tree Learners with Alternative Leaf Classifiers: An Empirical Study , 2001, FLAIRS Conference.