Decomposition: Some problems naturally divide into two or more hierarchical levels; e.g., in traditional VLSI design, place-then-route. Although the true objective function is only defined over fully instantiated solutions (at the lowest level), learned evaluation functions can provide an accurate heuristic to guide search at higher levels.

Transfer: Evaluation functions defined over a small set of high-level state-space "features" may readily be transferred, i.e., built from a training set of instances and then applied quickly to novel instances in any of the ways described above.

How can useful evaluation functions be learned automatically, through only trial-and-error simulations of the heuristic? In most cases, what is desired of the evaluation function is that it provide an assessment of the long-range utility of searching from a given state. Tools for exactly this problem are being developed in the reinforcement learning community under the rubric of "value function approximation" [2]. Alternatives to value function approximation include learning from "rollouts" (e.g., [1]) and treating the evaluation function weights as parameters to "meta-optimize" (e.g., [6]), as described above in the section on algorithm tuning.

Our survey includes summaries of five studies on learning evaluation functions for optimization:

McGovern, Moss, and Barto learn an evaluation function for move selection in the domain of optimizing compiled machine code, comparing a reinforcement-learning-based scheduler with one based on rollouts.

Boyan and Moore use reinforcement learning to build a secondary evaluation function for smart restarting. Their "STAGE" system alternately guides search with the learned evaluation function and with the original objective function (a toy sketch of this alternation appears at the end of this overview).

Moll, Perkins, and Barto apply an algorithm similar to STAGE to the NP-hard "dial-a-ride" problem (DARP). The learned function is instance-independent, so it applies quickly and effectively to new DARP instances.

Su, Buntine, Newton, and Peters learn a "state-action" evaluation function that allows efficient move sampling. They report impressive results in the domain of VLSI Standard Cell Placement.

Wolpert and Tumer give a principled method for decomposing a global objective function into a collection of localized objective functions, for use by independent computational agents. The approach is demonstrated on the domain of packet routing. (Also see Boese's abstract for other results on multi-agent optimization.)

Finally, since the techniques of reinforcement learning are so relevant to this line of research, we include summaries of two contributions that do not deal directly with large-scale optimization, but rather advance the state of the art in large-scale reinforcement learning:

Wang and Dietterich summarize the types of models that have been used for value function approximation, and introduce a promising new model based on regression trees.

Dean, Kim, and Hazlehurst describe an innovative, compact representation for large-scale sparse matrix operations, with application to efficient value function approximation.

It is our hope that these 14 summaries, taken together, provide a coherent overview of some of the first steps in applying machine learning to large-scale optimization. Numerous open yet manageable research problems remain unexplored, paving the way for rapid progress in this area.
Moreover, the improvements that result from the maturation of this research are not merely of academic interest, but can deliver significant gains to computer-aided design, supply-chain optimization, genomics, drug design, and many other realms of enormous economic and scientific importance.
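To make the value-function-approximation idea above concrete, the following is a minimal Python sketch of a STAGE-style alternation between searching on the true objective and searching on a learned evaluation function, in the spirit of Boyan and Moore's smart restarting. The toy bit-vector objective, the feature map, and the linear least-squares model are assumptions of this sketch, not details taken from any of the summarized systems.

# A minimal, illustrative sketch of STAGE-style smart restarting: alternately
# hillclimb on the true objective and on a learned value function that predicts
# the quality of the local optimum reachable from a given state.
# The toy objective, feature map, and linear model are assumptions of this sketch.

import random
import numpy as np

N = 40  # length of the bit-vector states in this toy problem

def objective(x):
    # Rugged toy objective (lower is better): penalize bit runs and imbalance.
    runs = sum(1 for i in range(1, N) if x[i] != x[i - 1])
    return runs + 0.1 * abs(sum(x) - N // 2)

def neighbors(x):
    # Single-bit-flip neighborhood.
    for i in range(N):
        y = list(x)
        y[i] = 1 - y[i]
        yield tuple(y)

def features(x):
    # Cheap high-level state-space features (an assumption of this sketch).
    ones = sum(x)
    runs = sum(1 for i in range(1, N) if x[i] != x[i - 1])
    return np.array([1.0, ones, runs, ones * runs])

def hillclimb(x, score, max_steps=200):
    # Greedy descent on an arbitrary evaluation function; returns the trajectory.
    traj = [x]
    for _ in range(max_steps):
        best = min(neighbors(x), key=score)
        if score(best) >= score(x):
            break
        x = best
        traj.append(x)
    return traj

def stage(iterations=20):
    x = tuple(random.randint(0, 1) for _ in range(N))
    X, y, best = [], [], None
    for _ in range(iterations):
        # Phase 1: search on the true objective from the current start state.
        traj = hillclimb(x, objective)
        final_value = objective(traj[-1])
        if best is None or final_value < best:
            best = final_value
        # Label every visited state with the outcome of the search from it.
        for s in traj:
            X.append(features(s))
            y.append(final_value)
        # Fit a linear value function approximator by least squares.
        w, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)
        # Phase 2: search on the learned evaluation to pick a promising restart.
        x = hillclimb(traj[-1], lambda s: float(features(s) @ w))[-1]
    return best

if __name__ == "__main__":
    print("best objective found:", stage())

On each iteration, every state visited while optimizing the true objective is labeled with the value of the local optimum eventually reached, the regression is refit on the accumulated examples, and a short search on the learned evaluation chooses the next restart point.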
[1] Steven Minton et al. Automatically configuring constraint satisfaction programs: A case study. Constraints, 1996.
[2] A. Juels et al. Topics in black-box combinatorial optimization. 1996.
[3] Martin Pelikan et al. Hill Climbing with Learning (An Abstraction of Genetic Algorithm). 1995.
[4] Russell Impagliazzo et al. Towards an analysis of local optimization algorithms. STOC '96, 1996.
[5] Peter Norvig et al. Artificial Intelligence: A Modern Approach. 1995.
[6] Gilbert Syswerda et al. Simulated Crossover in Genetic Algorithms. FOGA, 1992.
[7] Judea Pearl et al. Evidential Reasoning Using Stochastic Simulation of Causal Models. Artificial Intelligence, 1987.
[8] Judea Pearl et al. Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann series in representation and reasoning, 1991.
[9] Umesh V. Vazirani et al. "Go with the winners" algorithms. Proceedings of the 35th Annual Symposium on Foundations of Computer Science, 1994.
[10] Bart Selman et al. Local search strategies for satisfiability testing. Cliques, Coloring, and Satisfiability, 1993.
[11] Diane J. Cook et al. A Comparison of Multithreading Implementations. 1998.
[12] Mark Jerrum et al. Conductance and the rapid mixing property for Markov chains: the approximation of the permanent resolved. STOC '88, 1988.
[13] Richard E. Korf et al. Single-Agent Parallel Window Search. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1991.
[14] Mark Jerrum et al. Simulated annealing for graph bisection. Proceedings of the 34th Annual Symposium on Foundations of Computer Science, 1993.
[15] Ellen W. Zegura et al. An architecture for active networking. HPN, 1997.
[16] Kai Hwang et al. Load balancing methods for message-passing multicomputers. 1990.