Traditionally, planning involves a single agent for which a planner needs to find a sequence of actions that transforms some initial state into a state where a given goal statement is satisfied. A good example of such a problem is solving Rubik's Cube. The initial state is some configuration of the cube, and we need to find a sequence of rotations such that every tile on each side has the same color. Even though this problem is hard, the planning agent has full control over the situation: the outcome of a rotation action is completely known.

Real-world planning problems, however, seldom conform to this simple domain model. There may be uncontrollable actions of other agents in the domain interfering with the actions applied by the planning agent. Such uncertainty can be modeled by nondeterminism, where actions may have several possible outcomes. One approach is to assume that transition probabilities are known and produce plans with a high likelihood of success (e.g., [13, 8]). The scalability of such planners, however, is limited due to the overhead of reasoning about probabilities. In addition, it may be hard to gather enough statistical data to estimate the transition probabilities.

In this chapter, we consider a simpler model of nondeterminism without transition probabilities. The effect of a nondeterministic action is given as a set of possible next states. Recently, efficient planners have been developed for this class of nondeterministic domains (e.g., [3, 11]). These planners represent states and perform search implicitly in a space of Boolean functions represented efficiently with reduced Ordered Binary Decision Diagrams (OBDDs) [1]. The plans produced by these planners are encoded compactly with OBDDs and correspond to universal plans [19] or policies in Reinforcement Learning [14]. Hence, a nondeterministic plan is a state-action table mapping states to actions that are relevant to execute in order to reach a set of goal states. A plan is executed by repeatedly observing the current state, looking up an action for it in the table, and applying that action until a goal state is reached.
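To make the state-action-table reading of a nondeterministic plan concrete, the following minimal Python sketch uses a hypothetical toy domain (a robot moving along positions 0-4; names such as `successors`, `PLAN`, and `execute` are illustrative and not taken from any cited planner). A nondeterministic action maps a state to a set of possible next states without probabilities, the plan is a table from states to actions, and execution repeatedly looks up the current state until a goal state is observed.

```python
import random

# Hypothetical toy domain: a robot at positions 0..4 must reach position 4.
GOAL_STATES = {4}

def successors(state, action):
    """Return the set of possible next states of an action (no probabilities)."""
    if action == "move":
        # May advance one step or slip and stay in place.
        return {state, min(state + 1, 4)}
    if action == "jump":
        # May advance two steps or fall back one.
        return {max(state - 1, 0), min(state + 2, 4)}
    raise ValueError(f"unknown action {action!r}")

# A nondeterministic plan: a state-action table mapping each non-goal state
# to an action relevant to execute in that state in order to reach the goal.
PLAN = {0: "jump", 1: "move", 2: "jump", 3: "move"}

def execute(plan, state, max_steps=50):
    """Execute the plan: look up the current state, apply the chosen action,
    and observe which of the possible outcomes actually occurred."""
    for _ in range(max_steps):
        if state in GOAL_STATES:
            return state
        action = plan[state]  # consult the state-action table
        # The environment, not the agent, picks the actual outcome.
        state = random.choice(sorted(successors(state, action)))
    return state

if __name__ == "__main__":
    print(execute(PLAN, 0))  # typically prints 4 once the goal is reached
```

In the OBDD-based planners discussed above, the same ingredients are present, but the state sets, the transition relation, and the state-action table are encoded implicitly as Boolean functions rather than as explicit dictionaries, which is what allows them to handle very large state spaces.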
[1] Allen Newell, et al. GPS, a program that simulates human thought, 1995.
[2] Michael P. Georgeff, et al. Communication and interaction in multi-agent planning, 1983, AAAI.
[3] Judea Pearl, et al. Heuristics: intelligent search strategies for computer problem solving, 1984.
[4] Marco Pistore, et al. Weak, strong, and strong cyclic planning via symbolic model checking, 2003, Artif. Intell.
[5] Jaime G. Carbonell, et al. Counterplanning: A Strategy-Based Model of Adversary Planning in Real-World Situations, 1981, Artif. Intell.
[6] Thomas A. Henzinger, et al. Concurrent reachability games, 2007, Theor. Comput. Sci.
[7] Randal E. Bryant. Graph-Based Algorithms for Boolean Function Manipulation, 1986.
[8] Edmund H. Durfee, et al. Coordination of distributed problem solvers, 1988.
[9] Marcel Schoppers, et al. Universal Plans for Reactive Robots in Unpredictable Environments, 1987, IJCAI.
[10] Nicholas Kushmerick, et al. An Algorithm for Probabilistic Planning, 1995, Artif. Intell.
[11] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.
[12] Ariel Rubinstein, et al. A Course in Game Theory, 1995.
[13] Nicholas R. Jennings, et al. Controlling Cooperative Problem Solving in Industrial Multi-Agent Systems Using Joint Intentions, 1995, Artif. Intell.
[14] Randal E. Bryant. Graph-Based Algorithms for Boolean Function Manipulation, 1986, IEEE Transactions on Computers.
[15] D. Fudenberg, et al. The Theory of Learning in Games, 1998.
[16] Manuela M. Veloso, et al. Guided Symbolic Universal Planning, 2003, ICAPS.
[17] Jeffrey S. Rosenschein, et al. Incomplete Information and Deception in Multi-Agent Negotiation, 1991, IJCAI.
[18] Peter Haddawy, et al. Decision-theoretic Refinement Planning Using Inheritance Abstraction, 1994, AIPS.