Algorithm Selection using Reinforcement Learning

Many computational problems can be solved by multiple algorithms, with different algorithms fastest for different problem sizes, input distributions, and hardware characteristics. We consider the problem of algorithm selection: dynamically choosing an algorithm to attack an instance of a problem with the goal of minimizing overall execution time. We formulate the problem as a kind of Markov decision process (MDP) and use ideas from reinforcement learning to solve it. This paper introduces a kind of MDP that models the algorithm selection problem by allowing multiple state transitions. The well-known Q-learning algorithm is adapted to this case in a way that combines both Monte Carlo and temporal-difference methods. In addition, this work uses Boyan's Least-Squares Temporal Difference algorithm (LSTD(0)) and extends it to control problems. The experimental study focuses on the classic problems of order statistic selection and sorting. The encouraging results reveal the potential of applying learning methods to traditional computational problems.
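To make the setting concrete, the following is a minimal sketch of learned algorithm selection, assumed for illustration only: it is not the paper's formulation (which involves recursive MDPs and LSTD). A bandit-style Q-learner observes a coarse instance feature (an input-size bucket), chooses between two sorting algorithms instrumented to report their operation counts, and receives the negated count as reward. All function names, bucket boundaries, and learning parameters here are hypothetical choices.

```python
import random

def insertion_sort(a):
    """Sort a copy of `a`; also return a rough operation count."""
    a, steps = list(a), 0
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
            steps += 1
        a[j + 1] = key
        steps += 1
    return a, steps

def merge_sort(a):
    """Sort a copy of `a`; also return a rough operation count."""
    if len(a) <= 1:
        return list(a), 1
    mid = len(a) // 2
    left, s1 = merge_sort(a[:mid])
    right, s2 = merge_sort(a[mid:])
    merged, i, j, steps = [], 0, 0, s1 + s2
    while i < len(left) and j < len(right):
        steps += 1
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged += left[i:] + right[j:]
    return merged, steps

ALGOS = [insertion_sort, merge_sort]

def bucket(n):
    # Illustrative state abstraction: small vs. large inputs.
    return 0 if n <= 16 else 1

# Q[state][action]: estimated (negative) cost of running each algorithm.
Q = {0: [0.0, 0.0], 1: [0.0, 0.0]}
ALPHA, EPSILON = 0.1, 0.1

random.seed(0)
for episode in range(2000):
    n = random.choice([4, 256])
    instance = [random.random() for _ in range(n)]
    s = bucket(n)
    if random.random() < EPSILON:           # explore
        act = random.randrange(len(ALGOS))
    else:                                   # exploit
        act = max(range(len(ALGOS)), key=lambda k: Q[s][k])
    _, steps = ALGOS[act](instance)
    # One-step (bandit) Q-update; reward is the negated operation count.
    Q[s][act] += ALPHA * (-steps - Q[s][act])
```

After training, the greedy policy tends to pick insertion sort in the small-input bucket and merge sort in the large one, mirroring the intuition that the best algorithm depends on instance features.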