Reinforcement learning can be considered an adaptive learning method for autonomous agents. It is important to balance exploration of unknown knowledge against exploitation of already acquired knowledge. However, learning is not always efficient at every stage of the search, because ordinary Q-learning uses constant learning parameters. To address this problem, we have already proposed an adaptive Q-learning method whose learning parameters are tuned by fuzzy rules. Furthermore, ordinary reinforcement learning has difficulty handling continuous states and actions, and it is also difficult to learn problems with multiple objectives. Therefore, in this research, we propose a modified Q-learning method in which reward values are tuned according to the state, so that multiple objectives can be handled in a continuous state space by means of fuzzy reasoning. We also report simulation results for object-chasing agents obtained with this method.
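The idea of tuning the reward according to the state with fuzzy reasoning can be illustrated with a minimal sketch. The code below is a hypothetical example, not the authors' implementation: two purpose-specific rewards (one for approaching a target, one for a competing objective) are blended by simple "near"/"far" fuzzy membership functions over a normalized distance, and the blended reward drives an ordinary tabular Q-learning update.

```python
# Hedged sketch of state-dependent (fuzzy-blended) rewards feeding Q-learning.
# All names, membership shapes, and reward values here are illustrative
# assumptions, not taken from the paper.

def mu_near(d):
    """Fuzzy membership of 'near target' for normalized distance d in [0, 1]."""
    return max(0.0, 1.0 - d)

def mu_far(d):
    """Fuzzy membership of 'far from target' (complement shoulder)."""
    return min(1.0, d)

def fuzzy_reward(d, r_near=1.0, r_far=-0.1):
    """Blend two purpose-specific rewards by the fuzzy memberships of the
    current state, so the effective reward varies continuously with d."""
    w_n, w_f = mu_near(d), mu_far(d)
    return (w_n * r_near + w_f * r_far) / (w_n + w_f)

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Ordinary tabular Q-learning update using the fuzzy-tuned reward r."""
    Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])

# Example: an agent close to the target (d = 0.2) gets a mostly-positive
# blended reward, which is then used in a standard Q-value update.
Q = [[0.0, 0.0], [0.0, 0.0]]          # 2 discretized states, 2 actions
r = fuzzy_reward(0.2)
q_update(Q, s=0, a=0, r=r, s_next=1)
```

Because the memberships vary smoothly with the continuous distance, the effective reward changes gradually between the two objectives instead of switching abruptly, which is the role fuzzy reasoning plays in the proposed method.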