Learning Inadmissible Heuristics During Search

Suboptimal search algorithms offer shorter solving times by sacrificing guaranteed solution optimality. While optimal search algorithms like A* and IDA* require admissible heuristics, suboptimal search algorithms need not constrain their guidance in this way. Previous work has explored using off-line training to transform admissible heuristics into more effective inadmissible ones. In this paper, we demonstrate that this transformation can be performed on-line, during search. Besides requiring neither training instances nor extensive pre-computation, an on-line approach allows the learned heuristic to be tailored to a specific problem instance. We evaluate our techniques in four different benchmark domains using both greedy best-first search and bounded suboptimal search. We find that heuristics learned on-line result in both faster search and better solutions, while relying only on information readily available in any best-first search.