Temporal-Logic-Based Intermittent, Optimal, and Safe Continuous-Time Learning for Trajectory Tracking

In this paper, we develop safe reinforcement-learning-based controllers for systems tasked with accomplishing complex missions that can be expressed as linear temporal logic specifications, such as those required by search-and-rescue missions. We decompose the original mission into a sequence of tracking sub-problems under safety constraints. We impose the safety conditions by using barrier functions to map the constrained optimal tracking problem in the physical space to an unconstrained one in the transformed space. Furthermore, we develop policies that update the control signal intermittently, solving the tracking sub-problems with a reduced burden on communication and computation resources. Subsequently, an actor-critic algorithm is used to solve the underlying Hamilton-Jacobi-Bellman equations. Finally, we support our proposed framework with stability proofs and showcase its efficacy via simulation results.
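The barrier-function mapping mentioned above can be illustrated with a minimal sketch. It assumes a scalar state confined to an open interval (a, A) with a < 0 < A and uses a logarithmic barrier of a kind common in the safe-learning literature. The specific function, the names `barrier` and `barrier_inv`, and the numerical bounds are illustrative assumptions and are not taken from the paper itself.

```python
import math


def barrier(z: float, a: float, A: float) -> float:
    """Map a constrained state z in (a, A) to an unconstrained value s.

    Assumed form: s = log( A (a - z) / ( a (A - z) ) ), with a < 0 < A.
    s tends to -inf as z approaches a and to +inf as z approaches A.
    """
    if not (a < 0.0 < A):
        raise ValueError("expected a < 0 < A")
    if not (a < z < A):
        raise ValueError("state violates the safety bounds")
    return math.log((A * (a - z)) / (a * (A - z)))


def barrier_inv(s: float, a: float, A: float) -> float:
    """Map an unconstrained value s back to the physical state in (a, A)."""
    if not (a < 0.0 < A):
        raise ValueError("expected a < 0 < A")
    e = math.exp(s)
    return a * A * (e - 1.0) / (a * e - A)


if __name__ == "__main__":
    a, A = -1.0, 2.0  # hypothetical safety bounds on one state component
    for s in (-5.0, -1.0, 0.0, 1.0, 5.0):
        z = barrier_inv(s, a, A)
        print(f"s = {s:+.1f}  ->  z = {z:+.4f}")
```

Under these assumptions, a controller can be designed for the transformed variable without explicit constraints, since moderate finite values of the transformed state map back strictly inside (a, A); for very large magnitudes, floating-point rounding can return a value equal to a bound.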
