[1] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[2] Csaba Szepesvári,et al. Bandit Based Monte-Carlo Planning , 2006, ECML.
[3] Michael P. Wellman,et al. Stronger CDA strategies through empirical game-theoretic analysis and reinforcement learning , 2009, AAMAS.
[4] Reid G. Simmons,et al. Heuristic Search Value Iteration for POMDPs , 2004, UAI.
[5] Michael I. Jordan,et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes , 1994, ICML.
[6] Long-Ji Lin,et al. Self-improving reactive agents based on reinforcement learning, planning and teaching , 1992, Machine Learning.
[7] Michael P. Wellman,et al. Scaling simulation-based game analysis through deviation-preserving reduction , 2012, AAMAS.
[8] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[9] Simon M. Lucas,et al. A Survey of Monte Carlo Tree Search Methods , 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[10] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[11] Michael L. Littman,et al. Memoryless policies: theoretical limitations and practical results , 1994.
[12] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992.
[13] Dhananjay K. Gode,et al. Allocative Efficiency of Markets with Zero-Intelligence Traders: Market as a Partial Substitute for Individual Rationality , 1993, Journal of Political Economy.
[14] Alex M. Andrew,et al. Reinforcement Learning: An Introduction , 1998.
[15] John Loch,et al. Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes , 1998, ICML.
[16] Rajarshi Das,et al. High-performance bidding agents for the continuous double auction , 2001, EC '01.
[17] Robert Babuska,et al. Experience Replay for Real-Time Reinforcement Learning Control , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[18] Joel Veness,et al. Monte-Carlo Planning in Large POMDPs , 2010, NIPS.
[19] Michael P. Wellman. Methods for Empirical Game-Theoretic Analysis , 2006, AAAI.
[20] Peter Stone,et al. Function Approximation via Tile Coding: Automating Parameter Choice , 2005, SARA.
[21] J. Dickhaut,et al. Price Formation in Double Auctions , 1998.
[22] P. Taylor,et al. Evolutionarily Stable Strategies and Game Dynamics , 1978.