Learning Evaluation Functions to Improve Optimization by Local Search

This paper describes algorithms that learn to improve search performance on large-scale optimization tasks. The main algorithm, STAGE, works by learning an evaluation function that predicts the outcome of a local search algorithm, such as hillclimbing or Walksat, from features of states visited during search. The learned evaluation function is then used to bias future search trajectories toward better optima on the same problem. Another algorithm, X-STAGE, transfers previously learned evaluation functions to new, similar optimization problems. Empirical results are provided on seven large-scale optimization domains: bin-packing, channel routing, Bayesian network structure-finding, radiotherapy treatment planning, cartogram design, Boolean satisfiability, and Boggle board setup.
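The learn-then-bias loop described above can be illustrated with a minimal sketch. This is not the paper's implementation: the one-dimensional toy landscape (`toy_obj`, `toy_neighbors`), the single scalar feature, the straight-line regressor (`fit_line`), and the random-restart fallback are all assumptions chosen to keep the example short. The sketch alternates between greedy descent on the true objective, fitting a predictor of the descent's outcome from the visited states, and greedy descent on that predictor to choose the next starting state.

```python
import random

def hillclimb(start, score, neighbors):
    """Greedy descent on `score`; returns the list of visited states."""
    path = [start]
    while True:
        current = path[-1]
        best = min(neighbors(current), key=score)
        if score(best) >= score(current):
            return path  # no strictly better neighbor: local minimum
        path.append(best)

def fit_line(xs, ys):
    """Least-squares fit y ~ a + b*x for one scalar feature (assumed model)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var = sum((x - mean_x) ** 2 for x in xs)
    if var == 0:
        return mean_y, 0.0  # degenerate data: predict the mean
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var
    return mean_y - b * mean_x, b

def stage_sketch(obj, neighbors, feature, random_state, start, rounds=10):
    """Alternate search on `obj` with search on a learned outcome predictor."""
    xs, ys = [], []
    best_state, state = start, start
    for _ in range(rounds):
        # 1. Run the base local search on the true objective.
        path = hillclimb(state, obj, neighbors)
        outcome = obj(path[-1])
        if outcome < obj(best_state):
            best_state = path[-1]
        # 2. Label every visited state with the outcome the search reached.
        for s in path:
            xs.append(feature(s))
            ys.append(outcome)
        a, b = fit_line(xs, ys)
        predicted = lambda s: a + b * feature(s)
        # 3. Search on the learned function to pick a promising restart.
        new_start = hillclimb(path[-1], predicted, neighbors)[-1]
        # Assumed fallback: restart at random if the learned search is stuck.
        state = new_start if new_start != path[-1] else random_state()
    return best_state

# Hypothetical toy landscape: local minima at multiples of 10, best at 80.
def toy_obj(x):
    return 0.1 * abs(x - 80) + min(x % 10, 10 - x % 10)

def toy_neighbors(x):
    return [n for n in (x - 1, x + 1) if 0 <= n <= 100]

rng = random.Random(0)
best = stage_sketch(toy_obj, toy_neighbors, feature=float,
                    random_state=lambda: rng.randint(0, 100), start=3)
print(best, toy_obj(best))
```

Because the first round is an ordinary descent and the best end state is tracked throughout, the returned state is never worse than plain hillclimbing from the same start. A realistic application would replace the scalar feature and line fit with problem-specific features and a richer regressor.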
