Learning rewrite rules to improve plan quality

Considerable planning and learning research has been devoted to the problem of learning domain-specific search control rules to improve planning efficiency. There have also been a few attempts to learn search control rules that improve plan quality, but such efforts have been limited to state-space planners. The reason is that most of the newer planning approaches are based on plan refinement, and in such planners the information about the current state of the world that is required to evaluate a complex quality metric is simply not available during planning. An alternative technique is planning by rewriting, which first generates an initial plan using a refinement planner and then uses a set of rewrite rules to transform it into a higher-quality plan (Ambite & Knoblock 1997). Unlike search control rules, which are defined on the space of partial plans, rewrite rules are defined on the space of complete plans. This paper presents a system called REWRITE that automatically learns rewrite rules. REWRITE has three main components. The first is a partial-order causal-link planner (POP). The second component does the analytic work of identifying the replacing and to-be-replaced action sequences. The third component is a case library of plan-rewrite rules. The input to REWRITE's analytic component is (a) a problem described by an initial state and goals, (b) the plan and planning trace produced by the partial-order planner for this problem, and (c) a "better plan" for the same problem. The better plan is one that has a higher quality rating than the plan produced by the underlying partial-order planner, according to a quality function that assesses how resources are impacted by each plan. This better plan might be provided by some oracle, by a user, or by some other planner. REWRITE's analytic component first reconstructs a set of causal-link relationships between the steps in the better plan and a set of required ordering constraints.
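The reconstruction step can be illustrated with a minimal sketch. It is not REWRITE's actual procedure, which the text does not specify. It assumes that the better plan is given as a totally ordered sequence of STRIPS-style actions with uniquely named steps, and that each precondition is supported by the most recent earlier step that adds it. The names `Action` and `reconstruct_constraints`, and the dummy `START` and `FINISH` steps, are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Action:
    """A STRIPS-style step (assumed representation, not from the paper)."""
    name: str
    pre: FrozenSet[str] = frozenset()
    add: FrozenSet[str] = frozenset()
    delete: FrozenSet[str] = frozenset()


CausalLink = Tuple[str, str, str]  # (producer, condition, consumer)
Ordering = Tuple[str, str]         # (earlier step, later step)


def reconstruct_constraints(
    initial: Iterable[str], goals: Iterable[str], plan: List[Action]
) -> Tuple[Set[CausalLink], Set[Ordering]]:
    """Recover causal links and required orderings from a total-order plan."""
    # Dummy steps: START adds the initial state, FINISH requires the goals.
    steps = [Action("START", add=frozenset(initial)), *plan,
             Action("FINISH", pre=frozenset(goals))]
    indexed_links: List[Tuple[int, str, int]] = []

    for i, consumer in enumerate(steps):
        for cond in sorted(consumer.pre):
            # Walk backwards to the most recent step that affects cond.
            for j in range(i - 1, -1, -1):
                if cond in steps[j].add:
                    indexed_links.append((j, cond, i))
                    break
                if cond in steps[j].delete:
                    raise ValueError(f"{cond!r} is deleted before {consumer.name}")
            else:
                raise ValueError(f"{cond!r} is never established for {consumer.name}")

    links = {(steps[j].name, c, steps[i].name) for j, c, i in indexed_links}
    # Every causal link requires its producer to precede its consumer.
    orderings = {(steps[j].name, steps[i].name) for j, _, i in indexed_links}

    # A step that deletes a linked condition must stay outside the link's
    # interval; record the side of the interval it occupies in the given plan.
    for j, cond, i in indexed_links:
        for k, step in enumerate(steps):
            if cond not in step.delete or k in (j, i):
                continue
            if k < j:
                orderings.add((step.name, steps[j].name))
            else:
                orderings.add((steps[i].name, step.name))
    return links, orderings
```

Choosing the most recent producer is only one possible support assignment. When several earlier steps add the same condition, other valid causal structures exist, and a different choice would yield a different constraint set.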
The second step is to retrace POP's planning trace, looking for plan-refinement decisions that added a constraint that is not present in the better plan's constraint set. We call such a decision point a conflicting choice
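The retracing step can be sketched as a set-difference check over the trace. This is an illustration under stated assumptions, not REWRITE's implementation: the trace is assumed to be a list of decisions, each recording the constraints it added, and steps in the two plans are assumed to share names so that constraints can be compared directly. The names `Decision`, `Constraint`, and `find_conflicting_choices` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set, Tuple

# A constraint is a tagged tuple (assumed encoding):
#   ("link", producer, condition, consumer) or ("order", earlier, later)
Constraint = Tuple[str, ...]


@dataclass(frozen=True)
class Decision:
    """One plan-refinement decision recorded in the planner's trace."""
    index: int
    description: str
    added: FrozenSet[Constraint]


def find_conflicting_choices(
    trace: List[Decision], better_constraints: Set[Constraint]
) -> List[Tuple[Decision, FrozenSet[Constraint]]]:
    """Return each decision that added a constraint absent from the better plan."""
    conflicts = []
    for decision in trace:
        # Constraints this decision introduced that the better plan lacks.
        extra = decision.added - better_constraints
        if extra:
            conflicts.append((decision, extra))
    return conflicts


if __name__ == "__main__":
    better = {
        ("link", "START", "at-depot", "drive"),
        ("link", "drive", "at-site", "FINISH"),
        ("order", "START", "drive"),
        ("order", "drive", "FINISH"),
    }
    trace = [
        Decision(0, "add step fly to achieve at-site",
                 frozenset({("link", "fly", "at-site", "FINISH"),
                            ("order", "fly", "FINISH")})),
        Decision(1, "support at-depot from START",
                 frozenset({("order", "START", "drive")})),
    ]
    for decision, extra in find_conflicting_choices(trace, better):
        print(decision.index, decision.description, sorted(extra))
```

The comparison here is purely syntactic. An ordering that the better plan implies only transitively would be flagged unless the better plan's ordering set is first closed under transitivity.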