Paper Information - Efficient Iterative Linear-Quadratic Approximations for Nonlinear Multi-Player General-Sum Differential Games

Many problems in robotics involve multiple decision-making agents. To operate efficiently in such settings, a robot must reason about the impact of its decisions on the behavior of other agents. Differential games offer an expressive theoretical framework for formulating these types of multi-agent problems. Unfortunately, most numerical solution techniques scale poorly with state dimension and are rarely used in real-time applications. For this reason, it is common to predict the future decisions of other agents and solve the resulting decoupled, i.e., single-agent, optimal control problem. This decoupling neglects the underlying interactive nature of the problem; however, efficient solution techniques do exist for broad classes of optimal control problems. We take inspiration from one such technique, the iterative linear-quadratic regulator (ILQR), which repeatedly solves approximations of the problem with linear dynamics and quadratic costs. Similarly, our proposed algorithm repeatedly solves linear-quadratic games. We experimentally benchmark our algorithm in several examples with a variety of initial conditions and show that the resulting strategies exhibit complex interactive behavior. Our results indicate that our algorithm converges reliably and runs in real time. In a three-player, 14-state simulated intersection problem, our algorithm initially converges in < 0.25 s. Receding-horizon invocations converge in < 50 ms in a hardware collision-avoidance test.
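To make the "repeated linear-quadratic games" idea concrete, here is a minimal sketch of the inner step only: computing feedback Nash gains for a finite-horizon, discrete-time LQ game by a coupled Riccati-style backward recursion, in the textbook form found in Başar and Olsder's Dynamic Noncooperative Game Theory. It is not the authors' implementation. The function name `solve_lq_game`, the discrete-time setting, and the omission of affine terms and cross-player control costs are all assumptions made for illustration. The outer loop described in the abstract (re-linearizing the dynamics and re-quadraticizing the costs around the current trajectory) is not shown.

```python
import numpy as np

def solve_lq_game(A, Bs, Qs, Rs, horizon):
    """Feedback Nash gains for x[t+1] = A x[t] + sum_i Bs[i] u_i[t].

    Player i minimizes the sum over time of
    0.5 * (x' Qs[i] x + u_i' Rs[i] u_i), with terminal cost 0.5 * x' Qs[i] x.
    Returns gains[t][i] such that u_i[t] = -gains[t][i] @ x[t].
    (Hypothetical helper; the simplified cost structure is an assumption.)
    """
    n = A.shape[0]
    offsets = [0]
    for B in Bs:
        offsets.append(offsets[-1] + B.shape[1])
    total = offsets[-1]

    Zs = [Q.copy() for Q in Qs]  # cost-to-go matrices at the final time
    gains = []
    for _ in range(horizon):
        # Stack every player's first-order optimality condition into one
        # linear system S @ P = Y and solve for all gains simultaneously.
        S = np.zeros((total, total))
        Y = np.zeros((total, n))
        for i, (Bi, Ri, Zi) in enumerate(zip(Bs, Rs, Zs)):
            rows = slice(offsets[i], offsets[i + 1])
            for j, Bj in enumerate(Bs):
                cols = slice(offsets[j], offsets[j + 1])
                S[rows, cols] = Bi.T @ Zi @ Bj
            S[rows, rows] += Ri
            Y[rows] = Bi.T @ Zi @ A
        P = np.linalg.solve(S, Y)
        Ps = [P[offsets[i]:offsets[i + 1]] for i in range(len(Bs))]

        # Closed-loop dynamics under all players' feedback laws.
        F = A - sum(B @ Pi for B, Pi in zip(Bs, Ps))
        # Propagate each player's cost-to-go one step backward in time.
        Zs = [Q + Pi.T @ R @ Pi + F.T @ Z @ F
              for Q, R, Pi, Z in zip(Qs, Rs, Ps, Zs)]
        gains.append(Ps)

    gains.reverse()  # gains[0] now corresponds to the first time step
    return gains

# Toy example: two players pushing a shared scalar state toward zero.
A = np.array([[1.0]])
Bs = [np.array([[1.0]]), np.array([[1.0]])]
Qs = [np.eye(1), np.eye(1)]
Rs = [np.eye(1), np.eye(1)]
gains = solve_lq_game(A, Bs, Qs, Rs, horizon=1)
```

In this symmetric toy problem the one-step gains come out equal for both players, and each gain is a best response to the other's, which is the defining property of a feedback Nash equilibrium. An iterative scheme in the spirit of the abstract would call a solver like this once per iteration on the current local approximation.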
