A robust Markov game controller for nonlinear systems

This paper proposes a reinforcement learning (RL)-based game-theoretic formulation for designing robust controllers for nonlinear systems affected by bounded external disturbances and parametric uncertainties. Building on the theory of Markov games, we consider a differential game in which a 'disturbing' agent tries to generate the worst possible disturbance while a 'control' agent tries to apply the best control input. The problem is formulated as finding a min-max solution of a value function. We propose an online procedure for learning the optimal value function and for calculating a robust control policy. The proposed game-theoretic paradigm is tested on the control task of a highly nonlinear two-link robot system. We compare the performance of the proposed Markov game controller with that of a standard RL-based robust controller and an H∞ theory-based robust game controller. For the robot control task, the proposed controller achieved greater robustness to changes in payload mass and to external disturbances than the other control schemes. The results also validate the effectiveness of neural networks in extending the Markov game framework to problems with continuous state-action spaces.
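To make the min-max formulation concrete, the following is a minimal tabular sketch, not the paper's neural-network procedure. It assumes a cost-based setting in which the control agent minimizes and the disturbing agent maximizes, discrete states and actions, and pure (deterministic) strategies for both agents. The function names, the learning rate `alpha`, and the discount factor `gamma` are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def minimax_value(q_s):
    """Value of one state: min over controls of the worst-case (max) disturbance cost.

    q_s has shape (n_controls, n_disturbances) and holds cost-to-go estimates.
    Pure strategies are assumed here for simplicity.
    """
    return q_s.max(axis=1).min()

def robust_action(q_s):
    """Index of the control whose worst-case cost is smallest."""
    return int(q_s.max(axis=1).argmin())

def minimax_q_update(Q, s, u, d, cost, s_next, alpha=0.1, gamma=0.95):
    """One online temporal-difference update toward cost + gamma * V(s_next).

    Q has shape (n_states, n_controls, n_disturbances) and is updated in place.
    alpha and gamma are hypothetical settings chosen for illustration.
    """
    target = cost + gamma * minimax_value(Q[s_next])
    Q[s, u, d] += alpha * (target - Q[s, u, d])
    return Q

# Toy usage: 3 states, 2 control actions, 2 disturbance actions.
Q = np.zeros((3, 2, 2))
Q = minimax_q_update(Q, s=0, u=1, d=0, cost=1.0, s_next=2)
example = np.array([[1.0, 4.0], [2.0, 3.0]])
```

For the `example` matrix, the worst-case costs of the two controls are 4 and 3, so the robust choice is the second control with a min-max value of 3. A continuous state-action problem such as the two-link robot would replace the table `Q` with a function approximator, which this sketch does not attempt.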
