Discrete-time approximate dynamic programming (ADP) techniques have been widely used in the recent literature to determine optimal or near-optimal control policies for nonlinear systems. However, ADP inherently assumes at least partial knowledge of the system dynamics, as well as the value of the controlled plant one step ahead. In this work, a novel approach to ADP is attempted that relaxes the need for partial knowledge of the nonlinear system. The proposed methodology entails a two-part process: online system identification and offline optimal control training. First, in the identification process, a neural network (NN) is tuned online to learn the complete plant dynamics, and local asymptotic stability is shown under the mild assumption that the NN functional reconstruction errors lie within a small-gain-type norm-bounded conic sector. Then, using only the NN system model, offline ADP is attempted, resulting in a novel optimal control law. The proposed scheme does not require explicit knowledge of the system dynamics, as only the learned NN model is needed. A proof of convergence is given. Simulation results verify the theoretical conjecture.
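
As an illustration of the two-part process, the following is a minimal sketch under stated assumptions, not the scheme developed in the paper. The scalar plant `plant`, the fixed Gaussian hidden layer with tuned output weights, the normalized-gradient update, the quadratic stage cost, and the grid-based value iteration are all hypothetical choices made for the example. Part 1 tunes the model weights online from measured transitions; Part 2 runs value iteration offline using only the learned model.

```python
import numpy as np

rng = np.random.default_rng(0)

def plant(x, u):
    # Hypothetical "true" plant; used only to generate data and to test the policy.
    return 0.8 * np.sin(x) + u

# Assumed model structure: fixed Gaussian hidden layer in x, plus u and a bias.
centers = np.linspace(-2.0, 2.0, 9)
width = 0.5

def features(x, u):
    x = np.atleast_1d(np.asarray(x, dtype=float))
    u = np.atleast_1d(np.asarray(u, dtype=float))
    rbf = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))
    return np.hstack([rbf, u[:, None], np.ones((x.size, 1))])

def model(w, x, u):
    # Learned one-step-ahead prediction of the next state.
    return features(x, u) @ w

# Part 1: online identification with a normalized-gradient weight update.
w = np.zeros(centers.size + 2)
x = 0.0
for k in range(3000):
    u = rng.uniform(-1.0, 1.0)            # exploratory input
    x_next = plant(x, u)                  # measured next state
    phi = features(x, u)[0]
    err = x_next - phi @ w                # identification error
    w += 0.5 * err * phi / (1e-6 + phi @ phi)
    x = x_next

# Part 2: offline value iteration that queries only the learned model.
xs = np.linspace(-2.0, 2.0, 41)           # state grid
us = np.linspace(-1.0, 1.0, 21)           # candidate controls
X, U = np.meshgrid(xs, us, indexing="ij")
X_next = model(w, X.ravel(), U.ravel()).reshape(X.shape)
stage_cost = X ** 2 + U ** 2              # assumed quadratic cost

V = np.zeros(xs.size)                     # V_0 = 0
history = [V.copy()]
for j in range(50):
    V_next = np.interp(X_next.ravel(), xs, V).reshape(X.shape)
    Q = stage_cost + V_next
    V = Q.min(axis=1)                     # Bellman backup on the grid
    history.append(V.copy())
policy = us[Q.argmin(axis=1)]             # greedy control at each grid state

def control(x):
    # Interpolate the tabulated policy between grid states.
    return np.interp(x, xs, policy)

# Apply the learned policy to the plant from a nonzero initial state.
x_final = 1.5
for k in range(30):
    x_final = plant(x_final, control(x_final))

print("final |x|:", abs(x_final))
```

Starting from `V = 0` with a nonnegative stage cost makes the value iterates nondecreasing on the grid, which mirrors the monotone structure used in value-iteration convergence arguments. The plant function is never called in Part 2, so the control law depends on the dynamics only through the identified weights `w`.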