Paper information - Actor-Critic-Identifier Structure-Based Decentralized Neuro-Optimal Control of Modular Robot Manipulators With Environmental Collisions

This paper presents a decentralized zero-sum optimal control method for modular robot manipulators (MRMs) with environmental collisions, based on an adaptive dynamic programming (ADP) algorithm with an actor-critic-identifier (ACI) structure. The dynamic model of the MRMs is formulated via a novel collision identification method deployed on each joint module, in which local position and torque information are used to design the model compensation controller. A neural network (NN) identifier is developed to compensate for the model uncertainties, after which the optimal control problem of the MRMs with environmental collisions is transformed into a two-player zero-sum optimal control problem. Based on the ADP algorithm, the Hamilton-Jacobi-Isaacs (HJI) equation is solved by constructing actor-critic NNs, which makes the derivation of the approximate optimal control policy feasible. Using Lyapunov theory, the closed-loop robotic system is proved to be asymptotically stable. Finally, experiments are conducted to verify the effectiveness and advantages of the proposed method.
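For orientation, the generic textbook form of a two-player zero-sum problem and its HJI equation can be sketched as follows. This is an illustrative sketch, not the paper's formulation: the symbols $x$, $u$, $d$, $f$, $g$, $k$, $Q$, $R$, and $\gamma$ are assumptions introduced here, with $u$ standing for the minimizing player (for example, a joint control torque) and $d$ for the maximizing player (for example, the effect of a collision).

```latex
% Assumed input-affine dynamics with control u and disturbance d
\dot{x} = f(x) + g(x)\,u + k(x)\,d

% Assumed zero-sum cost: u minimizes, d maximizes
V(x_0) = \int_0^{\infty} \left( x^{\top} Q x + u^{\top} R u - \gamma^{2} d^{\top} d \right) dt

% HJI equation (stationarity of the Hamiltonian in u and d)
0 = \min_{u} \max_{d} \left[ x^{\top} Q x + u^{\top} R u - \gamma^{2} d^{\top} d
      + \nabla V^{\top} \left( f + g\,u + k\,d \right) \right]

% Saddle-point policies obtained from the stationarity conditions
u^{*} = -\tfrac{1}{2} R^{-1} g^{\top} \nabla V, \qquad
d^{*} = \tfrac{1}{2\gamma^{2}} k^{\top} \nabla V

% HJI equation after substituting u^* and d^*
0 = x^{\top} Q x + \nabla V^{\top} f
    - \tfrac{1}{4} \nabla V^{\top} g R^{-1} g^{\top} \nabla V
    + \tfrac{1}{4\gamma^{2}} \nabla V^{\top} k k^{\top} \nabla V
```

In general this equation has no closed-form solution for nonlinear dynamics, which is why approaches of this kind approximate $V$ (critic) and the policies (actor) with neural networks.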
