Learning dynamic algorithm portfolios

Algorithm selection can be performed using a model of the algorithms' runtime distributions, learned during a preliminary training phase. There is a trade-off between the performance of model-based algorithm selection and the cost of learning the model. In this paper, we treat this trade-off in the context of bandit problems. We propose a fully dynamic, online algorithm selection technique with no separate training phase: all candidate algorithms are run in parallel, while a model incrementally learns their runtime distributions. A redundant set of time allocators uses the partially trained model to propose machine time shares for the algorithms. A bandit problem solver mixes the model-based shares with a uniform share, gradually increasing the impact of the best time allocators as the model improves. We present experiments with a set of SAT solvers on a mixed SAT-UNSAT benchmark, and with a set of solvers for the Auction Winner Determination problem.
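
The mixing step described above can be illustrated with a minimal sketch. It assumes an Exp3-style update (the abstract does not name the bandit solver), treats each time allocator as one arm, and assumes the caller supplies a reward in [0, 1]; the function names, the two example allocators, and the value of gamma are all hypothetical.

```python
import math


def exp3_probabilities(weights, gamma):
    """Exp3 arm probabilities: normalized weights mixed with a uniform floor."""
    total = sum(weights)
    k = len(weights)
    return [(1.0 - gamma) * w / total + gamma / k for w in weights]


def exp3_update(weights, probs, chosen, reward, gamma):
    """Update only the chosen arm with an importance-weighted reward in [0, 1]."""
    k = len(weights)
    estimated = reward / probs[chosen]  # unbiased estimate of the arm's reward
    updated = list(weights)
    updated[chosen] *= math.exp(gamma * estimated / k)
    return updated


def mixed_shares(probs, proposals):
    """Average the allocators' proposed time shares under the arm probabilities."""
    n_algorithms = len(proposals[0])
    return [
        sum(p * proposal[i] for p, proposal in zip(probs, proposals))
        for i in range(n_algorithms)
    ]


# Hypothetical setup: arm 0 is a uniform allocator, arm 1 is a model-based
# allocator that favors the first of two algorithms.
proposals = [[0.5, 0.5], [0.9, 0.1]]
weights = [1.0, 1.0]
gamma = 0.1  # assumed exploration rate

probs = exp3_probabilities(weights, gamma)
shares_before = mixed_shares(probs, proposals)

# Suppose the model-based allocator was used and earned the maximum reward.
weights = exp3_update(weights, probs, chosen=1, reward=1.0, gamma=gamma)
probs_after = exp3_probabilities(weights, gamma)
shares_after = mixed_shares(probs_after, proposals)
```

In this sketch the uniform allocator keeps a guaranteed minimum influence through gamma, so a poorly trained model cannot starve any algorithm early on, while repeated rewards shift the mixed shares toward whichever allocator has been performing well.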
