The residual gradient FACL algorithm for differential games

This paper proposes a new fuzzy reinforcement learning algorithm that tunes both the input and the output parameters of a fuzzy logic controller (FLC). The algorithm uses three fuzzy inference systems (FISs): one acts as the actor (the FLC), and the other two act as critics. It applies the residual gradient value iteration algorithm described in [4] to tune the input and output parameters of the actor of the learning robot, and it also tunes the input and output parameters of the critics. The algorithm is called the residual gradient fuzzy actor critics learning (RGFACL) algorithm. It is used to learn a single pursuit-evasion differential game. Simulation results show that the RGFACL algorithm outperforms the fuzzy actor critic learning (FACL) and Q-learning fuzzy inference system (QLFIS) algorithms proposed in [3] and [7], respectively, in terms of convergence and speed of learning.
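As a rough illustration of the residual gradient idea only, and not the RGFACL update itself, the sketch below applies one residual-gradient step to a linear value estimate V(s) = w · φ(s). The function name, the feature vectors, and the default values of the discount factor `gamma` and learning rate `alpha` are assumptions made for this example; in a fuzzy setting, φ(s) could be taken as the normalized firing strengths of the rules and w as the output parameters of a critic.

```python
def residual_gradient_step(w, phi_s, phi_next, reward, gamma=0.9, alpha=0.1):
    """One residual-gradient update for a linear value estimate V(s) = w . phi(s).

    Hypothetical helper for illustration; names and defaults are assumptions.
    Returns the updated weights and the Bellman residual before the update.
    """
    v_s = sum(wi * fi for wi, fi in zip(w, phi_s))
    v_next = sum(wi * fi for wi, fi in zip(w, phi_next))

    # Bellman residual for the observed transition (s, reward, s').
    delta = reward + gamma * v_next - v_s

    # Gradient of 0.5 * delta**2 with respect to w is
    # delta * (gamma * phi_next - phi_s); descend along it.
    # Unlike the direct TD update, the successor-state term is kept.
    new_w = [
        wi - alpha * delta * (gamma * fn - fs)
        for wi, fs, fn in zip(w, phi_s, phi_next)
    ]
    return new_w, delta


if __name__ == "__main__":
    weights = [0.0, 0.0]
    weights, residual = residual_gradient_step(
        weights, phi_s=[1.0, 0.0], phi_next=[0.0, 1.0], reward=1.0
    )
    print(weights, residual)
```

The design point is that both V(s) and V(s') contribute to the gradient, so each step descends the squared Bellman residual of the sampled transition. Tuning input parameters, such as membership function centers and widths, would additionally require the gradient of φ with respect to those parameters, which this sketch omits.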

[1]  Howard M. Schwartz,et al.  Q(λ)‐learning adaptive fuzzy logic controllers for pursuit–evasion differential games, 2011.

[2]  Chi-Kwong Li,et al.  An approach to tune fuzzy controllers based on reinforcement learning for autonomous vehicle control , 2005, IEEE Transactions on Intelligent Transportation Systems.

[3]  Hugh F. Durrant-Whyte,et al.  A time-optimal control strategy for pursuit-evasion games problems , 2004, IEEE International Conference on Robotics and Automation (ICRA '04).

[4]  Rufus Isaacs,et al.  Differential Games, 1965.

[5]  Robert Babuska,et al.  Adaptive fuzzy control of satellite attitude by reinforcement learning , 1998, IEEE Trans. Fuzzy Syst..

[6]  Lionel Jouffe,et al.  Fuzzy inference system learning by reinforcement methods , 1998, IEEE Trans. Syst. Man Cybern. Part C.

[7]  Howard M. Schwartz,et al.  A novel technique to design a fuzzy logic controller using Q(λ)-learning and genetic algorithms in the pursuit-evasion game , 2009, 2009 IEEE International Conference on Systems, Man and Cybernetics.

[8]  Ebrahim H. Mamdani,et al.  An Experiment in Linguistic Synthesis with a Fuzzy Logic Controller , 1999, Int. J. Hum. Comput. Stud..

[9]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[10]  Toshiyuki Kondo,et al.  A reinforcement learning with evolutionary state recruitment strategy for autonomous mobile robots control , 2003, Robotics Auton. Syst..

[11]  Leemon C. Baird,et al.  Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.

[12]  Michio Sugeno,et al.  Fuzzy identification of systems and its applications to modeling and control , 1985, IEEE Transactions on Systems, Man, and Cybernetics.

[13]  Senén Barro,et al.  Autonomous and fast robot learning through motivation , 2007, Robotics Auton. Syst..

[14]  Leslie Pack Kaelbling,et al.  Effective reinforcement learning for mobile robots , 2002, Proceedings 2002 IEEE International Conference on Robotics and Automation.

[15]  Howard M. Schwartz,et al.  Self-learning fuzzy logic controllers for pursuit-evasion differential games , 2011, Robotics Auton. Syst..

[16]  Steven M. LaValle,et al.  Planning algorithms, 2006.

[17]  Andrew W. Moore,et al.  Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..

[18]  B. Silvano Zanutto,et al.  Learning Obstacle Avoidance with an Operant Behavior Model , 2004, Artificial Life.

[19]  Li-Xin Wang,et al.  A Course In Fuzzy Systems and Control, 1996.

[20]  N. H. C. Yung,et al.  A fuzzy controller with supervised learning assisted reinforcement learning algorithm for obstacle avoidance , 2003, IEEE Trans. Syst. Man Cybern. Part B.

[21]  Sidney Nascimento Givigi,et al.  A Reinforcement Learning Adaptive Fuzzy Controller for Differential Games , 2010, J. Intell. Robotic Syst..

[22]  Howard M. Schwartz,et al.  Hybrid intelligent systems applied to the pursuit-evasion game , 2009, 2009 IEEE International Conference on Systems, Man and Cybernetics.