Reinforcement distribution in fuzzy Q-learning
Andrea Bonarini | Alessandro Lazaric | Marcello Restelli | Francesco Montrone
[1] Richard Bellman,et al. Decision-making in fuzzy environment , 2012 .
[2] M. J. D. Powell,et al. Radial basis functions for multivariable interpolation: a review , 1987 .
[3] David S. Broomhead,et al. Multivariable Functional Interpolation and Adaptive Networks , 1988, Complex Syst..
[4] D. Broomhead,et al. Radial Basis Functions, Multi-Variable Functional Interpolation and Adaptive Networks , 1988 .
[5] R.J. Williams,et al. Reinforcement learning is direct adaptive optimal control , 1991, IEEE Control Systems.
[6] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[7] Hamid R. Berenji,et al. A reinforcement learning-based architecture for fuzzy logic control , 1992, Int. J. Approx. Reason..
[8] C. Anderson,et al. Multigrid Q-learning , 1994 .
[9] Andrew W. Moore,et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function , 1994, NIPS.
[10] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .
[11] Michael I. Jordan,et al. Reinforcement Learning with Soft State Aggregation , 1994, NIPS.
[12] Gerald Tesauro,et al. Temporal Difference Learning and TD-Gammon , 1995, J. Int. Comput. Games Assoc..
[13] Hyung Suck Cho,et al. A sensor-based navigation for a mobile robot using fuzzy logic and reinforcement learning , 1995, IEEE Trans. Syst. Man Cybern..
[14] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[15] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse , 1996 .
[16] Andrew G. Barto,et al. Learning to Act Using Real-Time Dynamic Programming , 1995, Artif. Intell..
[17] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[18] 김재현,et al. Fuzzy-Q learning , 1996 .
[19] T. Horiuchi,et al. Fuzzy interpolation-based Q-learning with continuous states and actions , 1996, Proceedings of IEEE 5th International Fuzzy Systems.
[20] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[21] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.
[22] Andrea Bonarini. Delayed Reinforcement , Fuzzy Q-Learning and Fuzzy Logic Controllers , 1996 .
[23] John N. Tsitsiklis,et al. Analysis of temporal-difference learning with function approximation , 1996, NIPS 1996.
[24] Janusz Kacprzyk,et al. Multistage Fuzzy Control: A Prescriptive Approach , 1997 .
[25] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[26] Kyung-Whan Oh,et al. A fuzzy reinforcement function for the intelligent agent to process vague goals , 2000, PeachFuzz 2000. 19th International Conference of the North American Fuzzy Information Processing Society - NAFIPS (Cat. No.00TH8500).
[27] Geoffrey J. Gordon. Reinforcement Learning with Function Approximation Converges to a Region , 2000, NIPS.
[28] Andrea Bonarini. Evolutionary learning, reinforcement learning, and fuzzy rules for knowledge acquisition in agent-based systems , 2001 .
[29] Stuart I. Reynolds. Reinforcement Learning with Exploration , 2002 .
[30] Meng Joo Er,et al. Automatic generation of fuzzy inference systems by dynamic fuzzy Q-learning , 2003, SMC'03 Conference Proceedings. 2003 IEEE International Conference on Systems, Man and Cybernetics. Conference Theme - System Security and Assurance (Cat. No.03CH37483).
[31] Meng Joo Er,et al. Efficient implementation of dynamic fuzzy Q-learning , 2003, Fourth International Conference on Information, Communications and Signal Processing, 2003 and the Fourth Pacific Rim Conference on Multimedia. Proceedings of the 2003 Joint.
[32] Dongbing Gu,et al. Learning fuzzy logic controller for reactive robot behaviours , 2003, Proceedings 2003 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM 2003).
[33] Chi-Kwong Li,et al. An approach to tune fuzzy controllers based on reinforcement learning , 2003, The 12th IEEE International Conference on Fuzzy Systems, 2003. FUZZ '03..
[34] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[35] José del R. Millán,et al. Continuous-Action Q-Learning , 2002, Machine Learning.
[36] Peter Dayan,et al. The convergence of TD(λ) for general λ , 1992, Machine Learning.
[37] Meng Joo Er,et al. Online tuning of fuzzy inference systems using dynamic fuzzy Q-learning , 2004, IEEE Trans. Syst. Man Cybern. Part B.
[38] Terrence J. Sejnowski,et al. TD(λ) Converges with Probability 1 , 1994, Machine Learning.
[39] Dongbing Gu,et al. Accuracy based fuzzy Q-learning for robot behaviours , 2004, 2004 IEEE International Conference on Fuzzy Systems (IEEE Cat. No.04CH37542).
[40] Peter Dayan,et al. Technical Note: Q-Learning , 2004, Machine Learning.
[41] Andrej Dobnikar,et al. Adaptive Radial Basis Decomposition by Learning Vector Quantization , 2003, Neural Processing Letters.
[42] Peter Stone,et al. Function Approximation via Tile Coding: Automating Parameter Choice , 2005, SARA.
[43] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[44] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[45] Bohdana Ratitch. On characteristics of markov decision processes and reinforcement learning in large domains , 2005 .
[46] H. Robbins. A Stochastic Approximation Method , 1951 .
[47] Dimitris C. Dracopoulos. Evolutionary Learning , 2008, Wiley Encyclopedia of Computer Science and Engineering.
[48] P. Schrimpf,et al. Dynamic Programming , 2011 .