Repairing Decision-Theoretic Policies Using Goal-Oriented Planning

In this paper we address the problem of repairing decision-theoretic policies. This work is motivated by observations in robotic soccer, where decision-theoretic policies become invalid due to small deviations during execution, and where repairing a policy might pay off compared to re-planning from scratch. Our policies are generated with Readylog, a derivative of Golog based on the situation calculus, which combines programming and planning for agents in dynamic domains. When an invalid policy is detected, the world state is transformed into a PDDL description and a state-of-the-art PDDL planner is deployed to calculate the repair plan.
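To illustrate the transformation step, the following is a minimal sketch of how a symbolic world state could be rendered as a PDDL problem for a repair planner. It is not the Readylog implementation: the function `world_state_to_pddl`, the domain name `soccer`, the predicates `at` and `ball-at`, and the object names are all hypothetical. The sketch also assumes that the repair goal is the state the remaining policy expected, which the abstract does not specify.

```python
def world_state_to_pddl(problem, domain, objects, facts, goal):
    """Render a symbolic world state as a PDDL problem string.

    facts and goal are lists of tuples such as ("at", "robot1", "cell_a").
    All names are illustrative assumptions, not Readylog identifiers.
    """
    def atom(fact):
        # ("at", "robot1", "cell_a") becomes "(at robot1 cell_a)"
        return "(" + " ".join(fact) + ")"

    lines = [
        f"(define (problem {problem})",
        f"  (:domain {domain})",
        "  (:objects " + " ".join(objects) + ")",
        "  (:init " + " ".join(atom(f) for f in facts) + ")",
        "  (:goal (and " + " ".join(atom(g) for g in goal) + "))",
        ")",
    ]
    return "\n".join(lines)


# Hypothetical situation: the robot ended up in cell_b, but the policy
# assumed it would be in cell_a. The observed state becomes :init and
# the expected state becomes :goal (an assumption of this sketch).
observed = [("at", "robot1", "cell_b"), ("ball-at", "cell_c")]
expected = [("at", "robot1", "cell_a"), ("ball-at", "cell_c")]

repair_problem = world_state_to_pddl(
    "repair", "soccer", ["robot1", "cell_a", "cell_b", "cell_c"],
    observed, expected,
)
print(repair_problem)
```

The resulting string could then be written to a file and passed to an external PDDL planner together with a matching domain file; the plan it returns would serve as the repair plan that brings execution back to a state where the original policy is valid.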
