Value Function Approximation in Zero-Sum Markov Games