Paper Information - Robust, Goal-directed Plan Execution with Bounded Risk

There is an increasing need for robust, optimal plan execution for multi-agent systems in uncertain environments that guarantees an acceptable probability of success. For example, a fleet of unmanned aerial vehicles (UAVs) and autonomous underwater vehicles (AUVs) is required to operate autonomously for an extended mission duration in an uncertain environment. Previous work introduced the concept of a model-based executive, which increases the level of autonomy by elevating the level at which systems are commanded. This thesis develops model-based executives that reason explicitly from a stochastic plant model to find the optimal course of action, while ensuring that the probability of failure is within a user-specified risk bound. This thesis presents two robust model-based executives: probabilistic Sulu, or p-Sulu, and distributed probabilistic Sulu, or dp-Sulu. The objective of p-Sulu and dp-Sulu is to allow users to command continuous, stochastic multi-agent systems in a manner that is both intuitive and safe. The user specifies the desired evolution of the plant state, as well as the acceptable probabilities of failure, as a temporal plan on states called a chance-constrained qualitative state plan (CCQSP). An example of a CCQSP statement is "go to A through B within 30 minutes, with less than 0.001% probability of failure." p-Sulu and dp-Sulu take a CCQSP, a continuous plant model with stochastic uncertainty, and an objective function as inputs, and output an optimal continuous control sequence as well as an optimal discrete schedule. The difference between the two is that p-Sulu plans in a centralized manner, while dp-Sulu plans in a distributed manner. dp-Sulu enables robust CCQSP execution for multi-agent systems.
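To make the CCQSP example above concrete, here is a minimal Python sketch of how such a plan might be held in memory. It is only an illustration: the class names, field names, event labels (`e0`, `e1`, `e2`), and the choice of "end in region" episodes are all assumptions made for this example, not the representation used by p-Sulu or dp-Sulu.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Episode:
    """A state requirement between two events (hypothetical structure)."""
    start_event: str
    end_event: str
    region: str   # name of the region the state must reach
    kind: str     # assumed label, e.g. "end_in"


@dataclass(frozen=True)
class TemporalConstraint:
    """Bounds, in minutes, on the elapsed time between two events."""
    from_event: str
    to_event: str
    lower_min: float
    upper_min: float


@dataclass(frozen=True)
class ChanceConstraint:
    """Upper bound on the probability of violating the listed episodes."""
    risk_bound: float
    episodes: Tuple[Episode, ...]


@dataclass(frozen=True)
class CCQSP:
    events: Tuple[str, ...]
    episodes: Tuple[Episode, ...]
    temporal_constraints: Tuple[TemporalConstraint, ...]
    chance_constraints: Tuple[ChanceConstraint, ...]


# "Go to A through B within 30 minutes, with less than 0.001% probability of failure."
via_b = Episode("e0", "e1", region="B", kind="end_in")
reach_a = Episode("e1", "e2", region="A", kind="end_in")

plan = CCQSP(
    events=("e0", "e1", "e2"),
    episodes=(via_b, reach_a),
    temporal_constraints=(TemporalConstraint("e0", "e2", 0.0, 30.0),),
    # 0.001% expressed as a probability
    chance_constraints=(ChanceConstraint(1e-5, (via_b, reach_a)),),
)
```

Frozen dataclasses are used so that a plan behaves as an immutable specification that a planner reads but does not modify.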
We solve the problem based on the key concept of risk allocation, which achieves tractability by allocating the specified risk to individual constraints and mapping the result into an equivalent deterministic constrained optimization problem. Risk allocation also enables distributed plan execution for multi-agent systems by distributing the risk among agents to decompose the optimization problem. Building upon the risk allocation approach, we develop our first CCQSP executive, p-Sulu, in four spirals. First, we develop the Convex Risk Allocation (CRA) algorithm, which can solve a CCQSP planning problem with a convex state space and a fixed schedule, highlighting the capability of optimally allocating risk to individual constraints. Second, we develop the Non-convex Iterative Risk Allocation (NIRA) algorithm, which can handle a non-convex state space. Third, we build upon NIRA a full-horizon CCQSP planner, p-Sulu FH, which can optimize not only the control sequence but also the schedule. Fourth, we develop p-Sulu, which enables the real-time execution of CCQSPs by employing the receding horizon approach. Our second CCQSP executive, dp-Sulu, is developed in two spirals. First, we develop the Market-based Iterative Risk Allocation (MIRA) algorithm, which can control a multi-agent system in a distributed manner by optimally distributing risk among agents through the market-based method called tâtonnement. Second and finally, we integrate the capability of MIRA into p-Sulu to build the robust model-based executive, dp-Sulu, which can execute CCQSPs on multi-agent systems in a distributed manner. Our simulation results demonstrate that our executives can efficiently solve CCQSP planning problems with significantly reduced suboptimality compared to prior art.

Thesis Supervisor: Prof. Brian C. Williams
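The risk allocation idea in the abstract can be illustrated with a small Python sketch. It assumes a Gaussian state distribution and linear constraints of the form h·x ≤ g, neither of which the abstract states. Under those assumptions, each chance constraint with individual risk δ_i becomes a deterministic bound on the mean state, tightened by a safety margin. By Boole's inequality, keeping the sum of the δ_i within the user-specified bound keeps the joint failure probability within that bound. The function names and numbers are hypothetical. The sketch only maps a given allocation to deterministic constraints; it does not optimize the allocation as CRA, NIRA, or MIRA do.

```python
from math import sqrt
from statistics import NormalDist


def safety_margin(h, cov, delta):
    """Margin m such that h.mean <= g - m implies P(h.x <= g) >= 1 - delta.

    Assumes x is Gaussian with covariance `cov`; `h` is the constraint normal.
    """
    n = len(h)
    variance = sum(h[i] * cov[i][j] * h[j] for i in range(n) for j in range(n))
    return sqrt(variance) * NormalDist().inv_cdf(1.0 - delta)


def is_valid_allocation(deltas, total_risk):
    """Boole's inequality: if the individual risks sum to at most total_risk,
    the probability of violating any constraint is at most total_risk."""
    in_range = all(0.0 < d < 1.0 for d in deltas)
    return in_range and sum(deltas) <= total_risk + 1e-12


def tighten(constraints, cov, deltas):
    """Map chance constraints (h, g) to deterministic bounds on the mean state."""
    return [(h, g - safety_margin(h, cov, d))
            for (h, g), d in zip(constraints, deltas)]


# Illustrative numbers only (not taken from the thesis).
cov = [[0.04, 0.0], [0.0, 0.01]]
constraints = [([1.0, 0.0], 5.0), ([0.0, 1.0], 3.0)]
total_risk = 0.01

uniform = [total_risk / 2, total_risk / 2]  # equal risk per constraint
skewed = [0.009, 0.001]                     # more risk given to the first constraint

uniform_bounds = tighten(constraints, cov, uniform)
skewed_bounds = tighten(constraints, cov, skewed)
```

Giving more risk to a constraint shrinks its margin and loosens its deterministic bound, at the cost of tightening the others. This is the trade-off a risk allocation algorithm exploits when it searches for the allocation that yields the best objective value.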

Masahiro Ono
