Synergizing reinforcement learning and game theory - A new direction for control
[1] Rajneesh Sharma,et al. A Markov Game-Adaptive Fuzzy Controller for Robot Manipulators , 2008, IEEE Transactions on Fuzzy Systems.
[2] T. T. Shannon,et al. Adaptive critic based adaptation of a fuzzy policy manager for a logistic system , 2001, Proceedings Joint 9th IFSA World Congress and 20th NAFIPS International Conference (Cat. No. 01TH8569).
[3] Michael L. Littman,et al. Friend-or-Foe Q-learning in General-Sum Games , 2001, ICML.
[4] Kenji Doya,et al. Reinforcement Learning in Continuous Time and Space , 2000, Neural Computation.
[5] O. J. Vrieze,et al. Surveys in game theory and related topics , 1987 .
[6] Ranjan K. Mallik,et al. Analysis of an on-off jamming situation as a dynamic game , 2000, IEEE Trans. Commun..
[7] Pravin Varaiya,et al. Stochastic Systems: Estimation, Identification, and Adaptive Control , 1986 .
[8] Makoto Yokoo,et al. Taming Decentralized POMDPs: Towards Efficient Policy Computation for Multiagent Settings , 2003, IJCAI.
[9] Rodney A. Brooks,et al. A Robust Layered Control System for a Mobile Robot , 1986, IEEE Journal of Robotics and Automation.
[10] Donald A. Sofge,et al. Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches , 1992 .
[11] Roberto A. Santiago,et al. Adaptive critic designs: A case study for neurocontrol , 1995, Neural Networks.
[12] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[13] Michail G. Lagoudakis,et al. Learning in Zero-Sum Team Markov Games Using Factored Value Functions , 2002, NIPS.
[14] O. J. Vrieze,et al. Stochastic Games with Finite State and Action Spaces. , 1988 .
[15] Joelle Pineau,et al. Anytime Point-Based Approximations for Large POMDPs , 2006, J. Artif. Intell. Res..
[16] Neil Immerman,et al. The Complexity of Decentralized Control of Markov Decision Processes , 2000, UAI.
[17] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[18] N.S. Patel. Lot allocation and process control in semiconductor manufacturing - a dynamic game approach , 2004, 2004 43rd IEEE Conference on Decision and Control (CDC) (IEEE Cat. No.04CH37601).
[19] Shlomo Zilberstein,et al. Dynamic Programming for Partially Observable Stochastic Games , 2004, AAAI.
[20] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[21] Richard S. Sutton,et al. Reinforcement learning with replacing eligibility traces , 1996, Machine Learning.
[22] Glenn A. Iba,et al. A Heuristic Approach to the Discovery of Macro-Operators , 1989, Machine Learning.
[23] Shlomo Zilberstein,et al. Finite-memory control of partially observable systems , 1998 .
[24] Eitan Altman,et al. Multiuser rate-based flow control , 1998, IEEE Trans. Commun..
[25] Masato Ishikawa,et al. 43rd IEEE Conference on Decision and Control , 2005 .
[26] Rajneesh Sharma,et al. A Safe and Consistent Game-theoretic Controller for Nonlinear Systems , 2005, IICAI.
[27] J. Doyle,et al. Robust and optimal control , 1995, Proceedings of 35th IEEE Conference on Decision and Control.
[28] Chuan-Kai Lin,et al. A reinforcement learning adaptive fuzzy controller for robots , 2003, Fuzzy Sets Syst..
[29] Shlomo Zilberstein,et al. Formal models and algorithms for decentralized decision making under uncertainty , 2008, Autonomous Agents and Multi-Agent Systems.
[30] Takashi Maeda,et al. On characterization of equilibrium strategy of two-person zero-sum games with fuzzy payoffs , 2003, Fuzzy Sets Syst..
[31] George G. Lendaris,et al. Dual heuristic programming for fuzzy control , 2001, Proceedings Joint 9th IFSA World Congress and 20th NAFIPS International Conference (Cat. No. 01TH8569).
[32] Michael H. Bowling,et al. Convergence and No-Regret in Multiagent Learning , 2004, NIPS.
[33] Henrik Schiøler,et al. Trophallaxis in robotic swarms - beyond energy autonomy , 2008, 2008 10th International Conference on Control, Automation, Robotics and Vision.
[34] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[35] Warren B. Powell,et al. Handbook of Learning and Approximate Dynamic Programming , 2006, IEEE Transactions on Automatic Control.
[36] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[37] Matthias Heger,et al. Consideration of Risk in Reinforcement Learning , 1994, ICML.
[38] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[39] Niket S. Kaisare,et al. Simulation based strategy for nonlinear optimal control: application to a microbial cell reactor , 2003 .
[40] Judy A. Franklin,et al. Biped dynamic walking using reinforcement learning , 1997, Robotics Auton. Syst..
[41] Bor-Sen Chen,et al. Robust tracking enhancement of robot systems including motor dynamics: a fuzzy-based dynamic game approach , 1998, IEEE Trans. Fuzzy Syst..
[42] Ilya V. Kolmanovsky,et al. Predictive energy management of a power-split hybrid electric vehicle , 2009, 2009 American Control Conference.
[43] Dimitri Jeltsema,et al. Proceedings Of The 2000 American Control Conference , 2000, Proceedings of the 2000 American Control Conference. ACC (IEEE Cat. No.00CH36334).
[44] Jun Morimoto,et al. Robust Reinforcement Learning , 2005, Neural Computation.
[45] Leslie Pack Kaelbling,et al. Associative Reinforcement Learning: Functions in k-DNF , 1994, Machine Learning.
[46] Michail G. Lagoudakis,et al. Value Function Approximation in Zero-Sum Markov Games , 2002, UAI.
[47] Michael P. Wellman,et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm , 1998, ICML.
[48] L. Shapley,et al. Stochastic Games , 1953, Proceedings of the National Academy of Sciences.
[49] Nikos A. Vlassis,et al. Optimal and Approximate Q-value Functions for Decentralized POMDPs , 2008, J. Artif. Intell. Res..
[50] Rajneesh Sharma,et al. Hybrid Game Strategy in Fuzzy Markov-Game-Based Control , 2008, IEEE Transactions on Fuzzy Systems.
[51] Victor R. Lesser,et al. Planning for Weakly-Coupled Partially Observable Stochastic Games , 2005, IJCAI.
[52] Tansu Alpcan,et al. Distributed Algorithms for Nash Equilibria of Flow Control Games , 2005 .
[53] Lionel Jouffe,et al. Fuzzy inference system learning by reinforcement methods , 1998, IEEE Trans. Syst. Man Cybern. Part C.
[54] Jong Min Lee,et al. Approximate Dynamic Programming Strategies and Their Applicability for Process Control: A Review and Future Directions , 2004 .
[55] Frank L. Lewis,et al. Direct-reinforcement-adaptive-learning neural network control for nonlinear systems , 1997, Proceedings of the 1997 American Control Conference (Cat. No.97CH36041).
[56] Jeng-Yih Chiou,et al. Reinforcement learning in zero-sum Markov games for robot soccer systems , 2004, IEEE International Conference on Networking, Sensing and Control, 2004.
[57] Manuela M. Veloso,et al. Multiagent learning using a variable learning rate , 2002, Artif. Intell..
[58] Michael L. Littman,et al. Algorithms for Sequential Decision Making , 1996 .
[59] Rajneesh Sharma,et al. A robust Markov game controller for nonlinear systems , 2007, Appl. Soft Comput..
[60] Frank L. Lewis,et al. Adaptive Critic Designs for Discrete-Time Zero-Sum Games With Application to $H_{\infty}$ Control , 2007, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[61] J. Filar,et al. Competitive Markov Decision Processes , 1996 .
[62] Ann Nowé,et al. Evolutionary game theory and multi-agent reinforcement learning , 2005, The Knowledge Engineering Review.
[63] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[64] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[65] Ariel Rubinstein,et al. A Course in Game Theory , 1995 .
[66] T. Başar,et al. Dynamic Noncooperative Game Theory , 1982 .
[67] J. von Neumann,et al. Theory of Games and Economic Behavior , 1944, Princeton University Press.
[68] Michael H. Bowling,et al. Convergence Problems of General-Sum Multiagent Reinforcement Learning , 2000, ICML.
[69] Michael I. Jordan,et al. Technical report, Massachusetts Institute of Technology, Artificial Intelligence Laboratory and Center for Biological and Computational Learning, Department of Brain and Cognitive Sciences , 1996 .
[70] Peter Dayan,et al. Technical Note: Q-Learning , 2004, Machine Learning.
[71] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[72] Gregory Z. Grudic,et al. Rates of Convergence of Performance Gradient Estimates Using Function Approximation and Bias in Reinforcement Learning , 2001, NIPS.
[73] Bernard Widrow,et al. Punish/Reward: Learning with a Critic in Adaptive Threshold Systems , 1973, IEEE Trans. Syst. Man Cybern..
[74] Michael L. Littman,et al. Value-function reinforcement learning in Markov games , 2001, Cognitive Systems Research.
[75] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[76] D. Fudenberg,et al. The Theory of Learning in Games , 1998 .
[77] T. Basar,et al. H∞-Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach , 1996, IEEE Trans. Autom. Control.
[78] Anne Condon,et al. On Algorithms for Simple Stochastic Games , 1990, Advances In Computational Complexity Theory.
[79] Bart De Schutter,et al. Multi-Agent Reinforcement Learning: A Survey , 2006, 2006 9th International Conference on Control, Automation, Robotics and Vision.
[80] Hyeong Soo Chang,et al. Two-person zero-sum Markov games: receding horizon approach , 2003, IEEE Trans. Autom. Control..
[81] Madan Gopal,et al. SVM-Based Tree-Type Neural Networks as a Critic in Adaptive Critic Designs for Control , 2007, IEEE Transactions on Neural Networks.
[82] Jeff G. Schneider,et al. Approximate solutions for partially observable stochastic games with common payoffs , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..