Multi-objective optimal control for a class of unknown nonlinear systems based on finite-approximation-error ADP algorithm

This paper proposes an optimal control method for a class of unknown discrete-time nonlinear systems with general multi-objective performance indices. The controller design requires only available input-output data rather than known system dynamics, and a data-based identifier is established with a stability proof. Using the weighted-sum technique, the multi-objective optimal control problem is transformed into a single-objective optimization problem. To solve the resulting Hamilton-Jacobi-Bellman (HJB) equation, a novel finite-approximation-error adaptive dynamic programming (ADP) algorithm is presented with a convergence proof. A detailed theoretical analysis of the relationship between approximation accuracy and algorithm convergence is given. It is shown that, when the convergence conditions are satisfied, the iterative performance index functions converge to a finite neighborhood of the greatest lower bound of all performance index functions. To facilitate implementation of the iterative ADP algorithm, neural networks are used to approximate the performance index function and to compute the optimal control policy, respectively. Finally, two simulation examples illustrate the performance of the proposed method.
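The two central ideas in the abstract, weighted-sum scalarization and value iteration carried out with a bounded approximation error, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the scalar dynamics x_{k+1} = 0.8 x_k + u_k, the grid, the two quadratic objectives, the weights, and the multiplicative noise that stands in for neural-network approximation error. The paper's actual algorithm, identifier, and convergence conditions are not reproduced.

```python
import random

# Hypothetical toy problem (not from the paper): states and controls live on
# integer grids, with x = i / 10 and u = j / 10.
STATE_IDX = range(-10, 11)    # x in [-1.0, 1.0]
CONTROL_IDX = range(-5, 6)    # u in [-0.5, 0.5]


def step(i, j):
    """Assumed dynamics x_{k+1} = 0.8 x_k + u_k, snapped and clipped to the grid."""
    return max(-10, min(10, round(0.8 * i + j)))


def utility(i, j, weights):
    """Weighted sum of two illustrative objectives: state cost and control cost."""
    x, u = i / 10, j / 10
    objectives = (x * x, u * u)
    return sum(w * obj for w, obj in zip(weights, objectives))


def value_iteration(weights, iters=100, rel_err=0.0, seed=0):
    """Value iteration on the scalarized problem, starting from V_0 = 0.

    Each exact update is multiplied by a factor in [1 - rel_err, 1 + rel_err],
    a crude stand-in for a bounded function-approximation error.
    """
    rng = random.Random(seed)
    V = {i: 0.0 for i in STATE_IDX}
    for _ in range(iters):
        new_V = {}
        for i in STATE_IDX:
            exact = min(utility(i, j, weights) + V[step(i, j)]
                        for j in CONTROL_IDX)
            new_V[i] = (1.0 + rng.uniform(-rel_err, rel_err)) * exact
        V = new_V
    return V


weights = (0.5, 0.5)                                # assumed objective weights
V_exact = value_iteration(weights)                  # no approximation error
V_approx = value_iteration(weights, rel_err=0.02)   # bounded relative error
print("exact  V(x=1.0):", V_exact[10])
print("approx V(x=1.0):", V_approx[10])
```

In this toy setting the perturbed iterates do not settle on the exact value function but stay within a bounded factor of it, which loosely mirrors the abstract's statement that the iterates converge to a finite neighborhood of the optimum. Changing `weights` trades off the two objectives and yields a different scalarized problem.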

[18]  Huaguang Zhang,et al.  The finite-horizon optimal control for a class of time-delay affine nonlinear system , 2011, Neural Computing and Applications.

[19]  Y. Haimes,et al.  The envelope approach for multiobjeetive optimization problems , 1987, IEEE Transactions on Systems, Man, and Cybernetics.

[20]  P. Fantini,et al.  A method for generating a well-distributed Pareto set in nonlinear multiobjective optimization , 2005 .

[21]  Xin Zhang,et al.  Data-Driven Robust Approximate Optimal Tracking Control for Unknown General Nonlinear Systems Using Adaptive Dynamic Programming Method , 2011, IEEE Transactions on Neural Networks.

[22]  Derong Liu,et al.  Adaptive dynamic programming with stable value iteration algorithm for discrete-time nonlinear systems , 2012, The 2012 International Joint Conference on Neural Networks (IJCNN).

[23]  Huaguang Zhang,et al.  Novel Stability Analysis for Recurrent Neural Networks With Multiple Delays via Line Integral-Type L-K Functional , 2010, IEEE Transactions on Neural Networks.

[24]  Yacov Y. Haimes,et al.  The Envelope Approach for Multiobjective Optimization Problems , 1985 .

[25]  Derong Liu,et al.  Finite-Approximation-Error-Based Optimal Control Approach for Discrete-Time Nonlinear Systems , 2013, IEEE Transactions on Cybernetics.

[26]  Christian Kirches,et al.  Efficient multiple objective optimal control of dynamic systems with integer controls , 2010 .

[27]  Derong Liu,et al.  Data-Based Controllability and Observability Analysis of Linear Discrete-Time Systems , 2011, IEEE Transactions on Neural Networks.

[28]  Tong Heng Lee,et al.  Data-Based Identification and Control of Nonlinear Systems via Piecewise Affine Approximation , 2011, IEEE Transactions on Neural Networks.

[29]  Sean R Eddy,et al.  What is dynamic programming? , 2004, Nature Biotechnology.

[30]  Huaguang Zhang,et al.  An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games , 2011, Autom..

[31]  Haibo He,et al.  A three-network architecture for on-line learning and optimization based on adaptive dynamic programming , 2012, Neurocomputing.

[32]  Weisheng Chen Adaptive NN control for discrete-time pure-feedback systems with unknown control direction under amplitude and rate actuator constraints. , 2009, ISA transactions.

[33]  A. Messac,et al.  Normal Constraint Method with Guarantee of Even Representation of Complete Pareto Frontier , 2004 .

[34]  George M. Siouris,et al.  Applied Optimal Control: Optimization, Estimation, and Control , 1979, IEEE Transactions on Systems, Man, and Cybernetics.

[35]  Quan Yongbing,et al.  Modeling, Identification and Control of a Class of Nonlinear System , 2001 .

[36]  Duan Li,et al.  Adaptive differential dynamic programming for multiobjective optimal control , 2002, Autom..

[37]  Haibo He,et al.  Adaptive Learning and Control for MIMO System Based on Adaptive Dynamic Programming , 2011, IEEE Transactions on Neural Networks.

[38]  Huaguang Zhang,et al.  A Novel Infinite-Time Optimal Tracking Control Scheme for a Class of Discrete-Time Nonlinear Systems via the Greedy HDP Iteration Algorithm , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).