A Decision Policy for the Routing and Munitions Management of Multiformations of Unmanned Combat Vehicles in Adversarial Urban Environments

The control and management of unmanned combat vehicles (UCVs) operating in an adversarial urban environment is a challenging task due, in part, to imperfect and incomplete information, the conflicting objectives of opposing teams, uncertain stochastic dynamics, and limited computational capability. In this paper, a decision policy built upon Markov decision processes is proposed to provide optimal routing and munitions management despite the conflicting objectives of the adversaries and the stochastic dynamics. The main novelty of the proposed decision policy lies in its handling of multiple UCV formations of varying dimensions. This multiformations capability is explicitly accounted for in the proposed formulation of the optimization problem. The UCVs, which constitute the blue team, aim to reach prescribed tactical target locations from a common starting point, following possibly different paths across an adversarial urban environment, within prescribed time windows and with maximum lethality. On their way, the UCVs face an adversarial red team composed of ground units that can engage any nearby UCV. The rendezvous objective of the blue team can be interpreted as a constraint in an optimization problem that aims to minimize damage while maximizing the total number of munitions remaining when the multiformations reach the targets. The blue and red teams play the roles of cost-function minimizer and maximizer, respectively. The worst-case minimization objective of the blue team is formulated as a finite-time optimization, which is solved by means of a dynamic programming equation whose value function evolves over a graph of feasible UCV paths. The resulting decision policy takes the form of a lookup table, which is well suited to online implementation.
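As a rough illustration of the worst-case dynamic programming idea described above, the following Python sketch runs a finite-horizon minimax backward recursion on a toy path graph and stores the result as a lookup table. The graph, the red-team actions, the damage values, the horizon, and the terminal penalty are all hypothetical placeholders, not taken from the paper. The sketch also omits munitions, multiple formations, and stochastic transitions.

```python
# Minimal sketch of a finite-horizon minimax recursion on a path graph.
# All names and numbers below are illustrative assumptions.

# Hypothetical graph: node -> feasible next nodes for a UCV formation.
GRAPH = {
    "start": ["a", "b"],
    "a": ["target"],
    "b": ["target"],
    "target": ["target"],
}
# Hypothetical red-team choices and the damage inflicted on a move into a node.
RED_ACTIONS = ("defend_a", "defend_b")
DAMAGE = {("a", "defend_a"): 3.0, ("b", "defend_b"): 2.0}
HORIZON = 2
MISS_PENALTY = 100.0  # terminal cost for not being at the target in time


def stage_cost(next_node, red_action):
    """Damage taken when moving into next_node under the given red action."""
    return DAMAGE.get((next_node, red_action), 0.0)


def solve_minimax(graph, horizon):
    """Backward recursion: blue minimizes the worst case over red actions."""
    value = {n: (0.0 if n == "target" else MISS_PENALTY) for n in graph}
    policy = {}  # lookup table: (time step, node) -> next node
    for t in range(horizon - 1, -1, -1):
        new_value = {}
        for node, successors in graph.items():
            best_move, best_cost = None, float("inf")
            for nxt in successors:
                # Red picks the action that maximizes blue's cost-to-go.
                worst = max(stage_cost(nxt, r) + value[nxt] for r in RED_ACTIONS)
                if worst < best_cost:
                    best_move, best_cost = nxt, worst
            new_value[node] = best_cost
            policy[(t, node)] = best_move
        value = new_value
    return value, policy


value, policy = solve_minimax(GRAPH, HORIZON)
```

Once computed offline, the `policy` dictionary can be queried online by time step and current node, which is the sense in which a lookup-table policy is cheap to execute.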
The practical case of imperfect information on the classification and the location of the adversarial ground units is addressed by means of a one-step lookahead rollout policy using estimates provided by a recursive Bayesian filter. Simulation results show that the concept of multiformations provides, on average, an improvement in performance when compared with single-formation routing.
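The recursive Bayesian estimation step mentioned above can be sketched, under simplifying assumptions, as a discrete belief update over hypotheses about a red unit's location. The hypothesis names, the detection probabilities, and the observation sequence are hypothetical and serve only to show the recursion. The paper's actual filter and sensor model are not reproduced here.

```python
# Minimal sketch of a recursive Bayes update over discrete hypotheses.
# Hypothesis names and sensor probabilities are illustrative assumptions.

def bayes_update(prior, likelihood):
    """One recursion step: posterior is proportional to likelihood * prior."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under every hypothesis")
    return {h: p / total for h, p in unnormalized.items()}


# Hypothetical sensor model: probability of a detection report per hypothesis.
P_DETECT = {"unit_at_a": 0.8, "unit_at_b": 0.3}

# Start from a uniform belief and fold in a short observation sequence.
belief = {"unit_at_a": 0.5, "unit_at_b": 0.5}
for detected in (True, True, False):
    likelihood = {h: (p if detected else 1.0 - p) for h, p in P_DETECT.items()}
    belief = bayes_update(belief, likelihood)
```

In a rollout scheme of the kind the abstract describes, a belief such as `belief` would stand in for the unknown red-team state when evaluating the one-step lookahead. How the paper couples the two is not specified in this abstract.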
