On Convex Data-Driven Inverse Optimal Control for Nonlinear, Non-stationary and Stochastic Systems

This paper is concerned with a finite-horizon inverse optimal control problem, whose goal is to infer, from observations, the possibly non-convex and non-stationary cost driving the actions of an agent. In this context, we present a result that enables the cost to be estimated by solving an optimization problem that is convex even when the agent's cost is not and when the underlying dynamics are nonlinear, non-stationary and stochastic. To obtain this result, we also study a finite-horizon forward control problem whose decision variables are randomized policies, and we give an explicit expression for its optimal solution. Finally, we turn our findings into algorithmic procedures and demonstrate their effectiveness through both in-silico validations and experiments on real hardware.
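The abstract does not reproduce the explicit expression it refers to. As a purely illustrative sketch, assuming a Kullback-Leibler-regularized (fully probabilistic) formulation in which the decision variables are randomized policies \(\pi_t(u_t \mid x_t)\) and deviations from a reference policy \(q_t\) are penalized, the optimal policy in such finite-horizon problems takes an exponential-tilting form:

\[
\pi_t^\star(u_t \mid x_t) \;\propto\; q_t(u_t \mid x_t)\,\exp\!\big(-\hat{c}_t(x_t, u_t)\big),
\]

where \(\hat{c}_t\) collects the stage cost and the expected optimal cost-to-go, computed by a backward recursion from the terminal time. The symbols here are generic placeholders, not the paper's notation.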

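Regarding the convex estimation step, a standard mechanism by which such problems become convex is to parameterize the unknown cost linearly in features, \(c(x, u) = \theta^\top \phi(x, u)\), and fit the induced exponential-family policy to the observed state-action pairs by maximum likelihood: each negative log-likelihood term is then affine plus log-sum-exp in \(\theta\), hence convex regardless of how non-convex the fitted cost is in \((x, u)\). The sketch below illustrates this mechanism with CVXPY on synthetic data; the dimensions, feature map, and data are hypothetical and are not taken from the paper.

    import numpy as np
    import cvxpy as cp

    # Hypothetical setup: N demonstrated state-action pairs, a finite action
    # set of size A, and a feature map phi(x, u) with d components.
    N, A, d = 200, 5, 8
    rng = np.random.default_rng(0)
    Phi = rng.standard_normal((N, A, d))   # Phi[i, a] stands in for phi(x_i, a)
    u_obs = rng.integers(0, A, size=N)     # index of the action observed at x_i

    theta = cp.Variable(d)  # parameters of the cost c(x, u) = theta @ phi(x, u)

    # Negative log-likelihood of the Boltzmann policy
    # pi(u | x) proportional to exp(-theta @ phi(x, u)):
    # each term is affine plus log-sum-exp in theta, hence convex in theta.
    nll = sum(
        Phi[i, u_obs[i]] @ theta + cp.log_sum_exp(-Phi[i] @ theta)
        for i in range(N)
    )

    cp.Problem(cp.Minimize(nll / N)).solve()
    print("estimated cost parameters:", theta.value)

Here the convexity comes entirely from the log-sum-exp structure of the likelihood, which is why the estimated cost itself may be arbitrarily non-convex in the state and input.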