Fuzzy Q-learning in continuous state and action space

Abstract An adaptive fuzzy Q-learning (AFQL) method based on fuzzy inference systems (FIS) is proposed. The FIS, realized by a normalized radial basis function (NRBF) neural network, approximates the Q-value function; its input is the state-action pair. FIS rules are created incrementally according to the novelty of each element of the state-action pair. Moreover, the premise and consequent parts of the FIS are updated using an extended Kalman filter (EKF). The action applied to the environment is the one that maximizes the FIS output in the current state, and it is found with an optimization method. Simulation results on the wall-following task for mobile robots and the inverted pendulum balancing problem demonstrate the superiority and applicability of the proposed AFQL method.
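The two core ideas in the abstract, an NRBF network that maps a state-action pair to a Q-value and a greedy action found by maximizing that output, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian membership form, the function names `nrbf_q` and `greedy_action`, the toy two-rule parameters, and the use of a plain grid search in place of the paper's unspecified optimization method. Incremental rule creation and the EKF update are not shown.

```python
import numpy as np

def nrbf_q(x, centers, widths, weights):
    """Q-value estimate for one state-action vector x (hypothetical helper).

    centers, widths: arrays of shape (n_rules, dim) for Gaussian premises.
    weights: array of shape (n_rules,) holding the rule consequents.
    """
    # Firing strength of each rule (assumed Gaussian membership functions).
    d2 = np.sum(((x - centers) / widths) ** 2, axis=1)
    phi = np.exp(-0.5 * d2)
    # Normalized weighted sum; the small constant guards against division by zero.
    return float(phi @ weights / (phi.sum() + 1e-12))

def greedy_action(state, centers, widths, weights, a_min, a_max, n=201):
    """Pick the scalar action with the largest Q estimate on a uniform grid.

    Grid search is a stand-in for whatever optimizer the method actually uses.
    """
    candidates = np.linspace(a_min, a_max, n)
    q = [nrbf_q(np.append(state, a), centers, widths, weights) for a in candidates]
    return float(candidates[int(np.argmax(q))])

# Toy example: one state dimension, one action dimension, two rules.
centers = np.array([[0.0, -1.0], [0.0, 1.0]])
widths = np.ones((2, 2))
weights = np.array([0.0, 1.0])  # the rule centered at action +1 has the higher consequent

state = np.array([0.0])
best = greedy_action(state, centers, widths, weights, a_min=-1.0, a_max=1.0)
print(best)
```

In this toy setup the Q estimate grows monotonically with the action, so the grid search returns the upper bound of the action range. A finer grid or a derivative-free optimizer would be a natural replacement for higher-dimensional actions.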
