Combination of online clustering and Q-value based GA for reinforcement fuzzy system design

This paper proposes a learning scheme for fuzzy system design with reinforcements that combines online clustering with a Q-value based genetic algorithm (GA), called CQGAF. CQGAF performs GA-based fuzzy system design in reinforcement learning environments where only weak reinforcement signals, such as "success" and "failure", are available. In CQGAF, there are no fuzzy rules initially; they are generated automatically. The precondition part of the fuzzy system is constructed online by an aligned clustering-based approach, which yields a flexible partition of the input space. The consequent part is then designed by Q-value based genetic reinforcement learning. Each individual in the GA population encodes the consequent part parameters of a fuzzy system and is associated with a Q-value. The Q-value estimates the discounted cumulative reinforcement obtained by the individual and is used as its fitness value for GA evolution. At each time step, an individual is selected according to the Q-values, the corresponding fuzzy system is built and applied to the environment, and a critic signal is received. With this critic signal, Q-learning with eligibility traces is executed. After each trial, the GA searches for better consequent parameters based on the learned Q-values. Thus, in CQGAF, evolution is performed immediately after the end of each trial, in contrast to a general GA, where many trials are performed before evolution. The feasibility of CQGAF is demonstrated through simulations of cart-pole balancing, magnetic levitation, and chaotic system control problems with only binary reinforcement signals.
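The per-step and per-trial loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the selection rule, the trace type, or the GA operators, so the epsilon-greedy selection, replacing traces, one-point crossover, Gaussian mutation, and all function names and parameter values (`select`, `q_update`, `evolve`, `alpha`, `gamma`, `lam`, `sigma`) are assumptions chosen for illustration. Each individual is a plain list of consequent parameters, and the Q-values are kept per individual rather than per state, which is a simplification.

```python
import random


def select(q_values, epsilon, rng):
    """Pick an individual's index by Q-value (assumed epsilon-greedy rule)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def q_update(q_values, traces, chosen, reward, terminal,
             alpha=0.1, gamma=0.95, lam=0.7):
    """One Q-learning step with replacing eligibility traces (assumed form)."""
    # Bootstrap from the best current Q-value unless the trial has ended.
    target = reward if terminal else reward + gamma * max(q_values)
    delta = target - q_values[chosen]
    traces[chosen] = 1.0
    for i in range(len(q_values)):
        q_values[i] += alpha * delta * traces[i]
        traces[i] *= gamma * lam  # decay every trace after the update


def evolve(population, q_values, rng, sigma=0.1):
    """After a trial: keep the better half by Q-value, refill with children.

    Needs at least 4 individuals with at least 2 parameters each.
    """
    order = sorted(range(len(population)),
                   key=lambda i: q_values[i], reverse=True)
    keep = order[: len(population) // 2]
    new_pop = [population[i][:] for i in keep]
    new_q = [q_values[i] for i in keep]
    while len(new_pop) < len(population):
        # Parents are drawn only from the survivors (the first len(keep) slots).
        a, b = rng.sample(range(len(keep)), 2)
        cut = rng.randrange(1, len(new_pop[a]))           # one-point crossover
        child = new_pop[a][:cut] + new_pop[b][cut:]
        child = [g + rng.gauss(0.0, sigma) for g in child]  # Gaussian mutation
        new_pop.append(child)
        # Assumed initialisation: a child inherits its parents' mean Q-value.
        new_q.append((new_q[a] + new_q[b]) / 2.0)
    return new_pop, new_q
```

In a full trial, `select` and `q_update` would be called at every time step with the critic signal as `reward` (for example, -1 on failure and 0 otherwise), the traces would be reset when the trial ends, and `evolve` would then be called once, which mirrors the "evolve after every trial" idea in the abstract.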
