This paper addresses the search control problem of selecting which plan to refine next, a choice point common to the decision-theoretic planners created to date. Such planners can use a utility function to calculate bounds on the expected utility of an abstract plan. Three strategies for using these bounds to select the next plan to refine have been proposed in the literature. We examine the rationale for each strategy and prove that, when looking for all plans with the highest expected utility, the optimistic strategy of always selecting a plan with the highest upper bound on expected utility expands the fewest plans. When looking for a single plan with the highest expected utility, we prove that the optimistic strategy has the best possible worst-case performance and that other strategies can fail to terminate. To demonstrate the effect of plan selection strategies on performance, we give results using the DRIPS planner showing that the optimistic strategy can produce exponential improvements in time and space.
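As a rough illustration of the optimistic strategy when a single best plan is sought, the following minimal sketch keeps a frontier of plans ordered by their upper bound on expected utility and always refines the plan with the highest upper bound. The `Plan` class, its fields, the toy plan hierarchy, and the assumption that a concrete plan has a point-valued expected utility (lower bound equal to upper bound) are all hypothetical choices made for this example; it is not the implementation used in the planner evaluated in the paper.

```python
import heapq
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Plan:
    """Hypothetical plan node: bounds on expected utility plus refinements."""
    name: str
    lower: float
    upper: float
    children: Tuple["Plan", ...] = ()  # empty tuple means the plan is concrete

    @property
    def is_concrete(self) -> bool:
        return not self.children


def optimistic_search(root: Plan) -> Tuple[Optional[Plan], int]:
    """Always refine the plan with the highest upper bound.

    Returns the first concrete plan popped from the frontier and the number
    of plans refined. Assumes concrete plans have lower == upper, so the
    popped concrete plan is at least as good as every plan still queued.
    """
    counter = 0  # unique tie-breaker so Plan objects are never compared
    frontier = [(-root.upper, counter, root)]  # max-heap via negated bound
    refined = 0
    while frontier:
        _, _, plan = heapq.heappop(frontier)
        if plan.is_concrete:
            return plan, refined
        refined += 1
        for child in plan.children:
            counter += 1
            heapq.heappush(frontier, (-child.upper, counter, child))
    return None, refined


# Toy hierarchy (assumed values): child bounds lie within the parent's bounds.
a = Plan("A", 2.0, 9.0, (Plan("A1", 6.0, 6.0), Plan("A2", 4.0, 4.0)))
b = Plan("B", 1.0, 5.0, (Plan("B1", 5.0, 5.0), Plan("B2", 1.0, 1.0)))
root = Plan("root", 0.0, 10.0, (a, b))

best, refined = optimistic_search(root)
```

In this toy hierarchy the search refines `root` and `A`, then stops at `A1`; plan `B` is never refined because its upper bound of 5 is below the utility of `A1`.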