Paper information - Reinforcement distribution in fuzzy Q-learning

[10] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .

[11] Michael I. Jordan,et al. Reinforcement Learning with Soft State Aggregation , 1994, NIPS.

[12] Gerald Tesauro,et al. Temporal Difference Learning and TD-Gammon , 1995, J. Int. Comput. Games Assoc..

[13] Hyung Suck Cho,et al. A sensor-based navigation for a mobile robot using fuzzy logic and reinforcement learning , 1995, IEEE Trans. Syst. Man Cybern..

[14] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..

[15] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse , 1996 .

[16] Andrew G. Barto,et al. Learning to Act Using Real-Time Dynamic Programming , 1995, Artif. Intell..

[17] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.

[18] 김재현,et al. Fuzzy-Q learning , 1996 .

[19] T. Horiuchi,et al. Fuzzy interpolation-based Q-learning with continuous states and actions , 1996, Proceedings of IEEE 5th International Fuzzy Systems.

[20] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.

[21] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.

[22] Andrea Bonarini. Delayed Reinforcement, Fuzzy Q-Learning and Fuzzy Logic Controllers , 1996 .

[23] John N. Tsitsiklis,et al. Analysis of temporal-difference learning with function approximation , 1996, NIPS 1996.

[24] Janusz Kacprzyk,et al. Multistage Fuzzy Control: A Prescriptive Approach , 1997 .

[25] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .

[26] Kyung-Whan Oh,et al. A fuzzy reinforcement function for the intelligent agent to process vague goals , 2000, PeachFuzz 2000. 19th International Conference of the North American Fuzzy Information Processing Society - NAFIPS (Cat. No.00TH8500).

[27] Geoffrey J. Gordon. Reinforcement Learning with Function Approximation Converges to a Region , 2000, NIPS.

[28] Andrea Bonarini. Evolutionary learning, reinforcement learning, and fuzzy rules for knowledge acquisition in agent-based systems , 2001 .

[29] Stuart I. Reynolds. Reinforcement Learning with Exploration , 2002 .

[30] Meng Joo Er,et al. Automatic generation of fuzzy inference systems by dynamic fuzzy Q-learning , 2003, SMC'03 Conference Proceedings. 2003 IEEE International Conference on Systems, Man and Cybernetics. Conference Theme - System Security and Assurance (Cat. No.03CH37483).

[31] Meng Joo Er,et al. Efficient implementation of dynamic fuzzy Q-learning , 2003, Fourth International Conference on Information, Communications and Signal Processing, 2003 and the Fourth Pacific Rim Conference on Multimedia. Proceedings of the 2003 Joint.

[32] Dongbing Gu,et al. Learning fuzzy logic controller for reactive robot behaviours , 2003, Proceedings 2003 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM 2003).

[33] Chi-Kwong Li,et al. An approach to tune fuzzy controllers based on reinforcement learning , 2003, The 12th IEEE International Conference on Fuzzy Systems, 2003. FUZZ '03..

[34] Peter Dayan,et al. Q-learning , 1992, Machine Learning.

[35] José del R. Millán,et al. Continuous-Action Q-Learning , 2002, Machine Learning.

[36] Peter Dayan,et al. The convergence of TD(λ) for general λ , 1992, Machine Learning.

[37] Meng Joo Er,et al. Online tuning of fuzzy inference systems using dynamic fuzzy Q-learning , 2004, IEEE Trans. Syst. Man Cybern. Part B.

[38] Terrence J. Sejnowski,et al. TD(λ) Converges with Probability 1 , 1994, Machine Learning.

[39] Dongbing Gu,et al. Accuracy based fuzzy Q-learning for robot behaviours , 2004, 2004 IEEE International Conference on Fuzzy Systems (IEEE Cat. No.04CH37542).

[40] Peter Dayan,et al. Technical Note: Q-Learning , 2004, Machine Learning.

[41] Andrej Dobnikar,et al. Adaptive Radial Basis Decomposition by Learning Vector Quantization , 2003, Neural Processing Letters.

[42] Peter Stone,et al. Function Approximation via Tile Coding: Automating Parameter Choice , 2005, SARA.

[43] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[44] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.

[45] Bohdana Ratitch. On characteristics of markov decision processes and reinforcement learning in large domains , 2005 .

[46] H. Robbins. A Stochastic Approximation Method , 1951 .

[47] Dimitris C. Dracopoulos. Evolutionary Learning , 2008, Wiley Encyclopedia of Computer Science and Engineering.

[48] P. Schrimpf,et al. Dynamic Programming , 2011 .