ART-Based Neuro-fuzzy Modelling Applied to Reinforcement Learning

The mountain car problem is a well-known task, often used for testing reinforcement learning algorithms. Its state variables are real-valued, so some kind of function approximation is required. In this paper, three reinforcement learning architectures are compared on the mountain car problem. The comparison results indicate the potential of the actor-only approach. The function approximation modules used are based on NeuroFAST (Neuro-Fuzzy ART-Based Structure and Parameter Learning TSK Model). NeuroFAST is a neuro-fuzzy modelling algorithm with well-proven function approximation capabilities; it features the functional reasoning method (the Takagi-Sugeno-Kang fuzzy model), Fuzzy ART concepts and specific techniques.
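
To make the functional reasoning method concrete, the following is a minimal sketch of first-order Takagi-Sugeno-Kang inference over a real-valued state such as the (position, velocity) pair of the mountain car. It is not the NeuroFAST algorithm: the Gaussian membership functions, the rule layout and the name `tsk_predict` are assumptions made for illustration, and the Fuzzy ART-based structure learning and the parameter learning are not reproduced here.

```python
import math


def tsk_predict(x, rules):
    """First-order TSK inference (illustrative sketch, not NeuroFAST itself).

    x     : sequence of real-valued inputs, e.g. (position, velocity)
    rules : list of (centers, widths, coeffs, bias) tuples (hypothetical layout)
            centers, widths -> Gaussian membership per input (an assumption)
            coeffs, bias    -> linear consequent y = bias + sum(coeffs[i] * x[i])
    """
    weighted_sum = 0.0
    total_strength = 0.0
    for centers, widths, coeffs, bias in rules:
        # Firing strength: product of the per-input Gaussian memberships.
        strength = 1.0
        for xi, c, s in zip(x, centers, widths):
            strength *= math.exp(-0.5 * ((xi - c) / s) ** 2)
        # Rule consequent: a linear function of the inputs.
        y = bias + sum(a * xi for a, xi in zip(coeffs, x))
        weighted_sum += strength * y
        total_strength += strength
    # Model output: firing-strength-weighted average of the rule consequents.
    if total_strength == 0.0:
        return 0.0  # no rule fires (numerical underflow); arbitrary fallback
    return weighted_sum / total_strength


# Two hypothetical rules over (position, velocity); the numbers are made up.
example_rules = [
    ((-0.9, 0.0), (0.4, 0.03), (0.5, 10.0), -1.0),
    ((0.2, 0.0), (0.4, 0.03), (-0.5, 10.0), 1.0),
]
value_estimate = tsk_predict((-0.5, 0.01), example_rules)
```

Because every rule contributes a local linear model and the memberships blend them smoothly, an approximator of this kind can represent a value function or a policy over the continuous state space without discretising it.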
