Stochastic Direct Reinforcement: Application to Simple Games with Recurrence
Yufeng Liu | John Moody | Matthew Saffell | Kyoungju Youn
[1] Gerald Tesauro,et al. Extending Q-Learning to General Adaptive Multi-Agent Systems , 2003, NIPS.
[2] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[3] Lizhong Wu,et al. Optimization of trading systems and portfolios , 1997, Proceedings of the IEEE/IAFE 1997 Computational Intelligence for Financial Engineering (CIFEr).
[4] Jeffrey O. Kephart,et al. Dynamic pricing by software agents , 2000, Comput. Networks.
[5] Craig Boutilier,et al. Cooperative Negotiation in Autonomic Systems using Incremental Utility Elicitation , 2002, UAI.
[6] J. Moody,et al. Performance functions and reinforcement learning for trading systems and portfolios , 1998.
[7] Michael I. Jordan,et al. PEGASUS: A policy search method for large MDPs and POMDPs , 2000, UAI.
[8] Long-Ji Lin,et al. Memory Approaches to Reinforcement Learning in Non-Markovian Domains , 1992.
[9] Michael P. Wellman,et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm , 1998, ICML.
[10] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[11] Matthew Saffell,et al. Learning to trade via direct reinforcement , 2001, IEEE Trans. Neural Networks.
[12] Robert H. Crites,et al. Multiagent reinforcement learning in the Iterated Prisoner's Dilemma , 1996, Bio Systems.
[13] Jeffrey O. Kephart,et al. Pricing in Agent Economies Using Multi-Agent Q-Learning , 2002, Autonomous Agents and Multi-Agent Systems.
[14] Gerald Tesauro,et al. Strategic sequential bidding in auctions using dynamic programming , 2002, AAMAS '02.
[15] Yishay Mansour,et al. Nash Convergence of Gradient Dynamics in General-Sum Games , 2000, UAI.
[16] Manuela M. Veloso,et al. Multiagent learning using a variable learning rate , 2002, Artif. Intell.
[17] R. J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[18] Andrew McCallum,et al. Instance-Based Utile Distinctions for Reinforcement Learning , 1995.
[19] Charles W. Anderson,et al. Approximating a Policy Can be Easier Than Approximating a Value Function , 2000.
[20] Matthew Saffell,et al. Reinforcement Learning for Trading , 1998, NIPS.