Actor-Critic-Identifier Structure-Based Decentralized Neuro-Optimal Control of Modular Robot Manipulators With Environmental Collisions

This paper presents a decentralized zero-sum optimal control method for modular robot manipulators (MRMs) subject to environmental collisions, using an actor-critic-identifier (ACI) structure-based adaptive dynamic programming (ADP) algorithm. The dynamic model of the MRMs is formulated with a novel collision identification method deployed in each joint module, in which local position and torque information is used to design the model compensation controller. A neural network (NN) identifier is developed to compensate for the model uncertainties, so that the optimal control problem of the MRMs with environmental collisions can be transformed into a two-player zero-sum optimal control problem. Based on the ADP algorithm, the Hamilton-Jacobi-Isaacs (HJI) equation is solved by constructing actor-critic NNs, which makes it feasible to derive the approximate optimal control policy. Using Lyapunov theory, the closed-loop robotic system is proved to be asymptotically stable. Finally, experiments are conducted to verify the effectiveness and advantages of the proposed method.
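
For orientation, the two-player zero-sum problem and its HJI equation can be sketched in the generic textbook form below. This is an illustration under assumed notation, not the paper's own derivation: the state $x$, drift $f$, input gains $g$ and $k$, weights $Q$ and $R$, and attenuation level $\gamma$ are all hypothetical symbols, with the control $u$ as the minimizing player and the collision-induced disturbance $d$ as the maximizing player.

```latex
% Assumed (hypothetical) subsystem dynamics: u is the control, d the disturbance.
\dot{x} = f(x) + g(x)\,u + k(x)\,d

% Zero-sum cost: u minimizes, d maximizes; Q \succeq 0, R \succ 0, \gamma > 0.
J(x_0, u, d) = \int_{0}^{\infty}
  \left( x^{\top} Q x + u^{\top} R u - \gamma^{2} d^{\top} d \right) \mathrm{d}t,
\qquad
V^{*}(x_0) = \min_{u} \max_{d} J(x_0, u, d)

% Saddle-point policies from the stationarity conditions of the Hamiltonian.
u^{*} = -\tfrac{1}{2} R^{-1} g(x)^{\top} \nabla V^{*}(x),
\qquad
d^{*} = \tfrac{1}{2\gamma^{2}} k(x)^{\top} \nabla V^{*}(x)

% HJI equation satisfied by the optimal value function, with V^{*}(0) = 0.
0 = x^{\top} Q x + u^{*\top} R u^{*} - \gamma^{2} d^{*\top} d^{*}
  + \nabla V^{*}(x)^{\top} \left( f(x) + g(x)\,u^{*} + k(x)\,d^{*} \right)
```

In an actor-critic scheme of this kind, the critic NN typically approximates $V^{*}$ and the actor NN approximates $u^{*}$, because the HJI equation rarely admits a closed-form solution for nonlinear dynamics.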
