Learning against learning: evolutionary dynamics of reinforcement learning algorithms in strategic interactions
[1] B. Steele. For More Information , 2000, Journal of the National Cancer Institute.
[2] Peter Stone,et al. Convergence, Targeted Optimality, and Safety in Multiagent Learning , 2010, ICML.
[3] Peter McBurney,et al. An evolutionary game-theoretic comparison of two double-auction market designs , 2004, AAMAS'04.
[4] Josef Hofbauer,et al. Evolutionary Games and Population Dynamics , 1998 .
[5] Jan Ramon,et al. An evolutionary game-theoretic analysis of poker strategies , 2009, Entertain. Comput..
[6] Ronald A. Howard,et al. Dynamic Programming and Markov Processes , 1960 .
[7] Michael P. Wellman,et al. Nash Q-Learning for General-Sum Stochastic Games , 2003, J. Mach. Learn. Res..
[8] A. Cowles. Can Stock Market Forecasters Forecast , 1933 .
[9] R. Munos,et al. Best Arm Identification in Multi-Armed Bandits , 2010, COLT.
[10] Sean Luke,et al. Cooperative Multi-Agent Learning: The State of the Art , 2005, Autonomous Agents and Multi-Agent Systems.
[11] P. S. Sastry,et al. Varieties of learning automata: an overview , 2002, IEEE Trans. Syst. Man Cybern. Part B.
[12] Bart De Schutter,et al. A Comprehensive Survey of Multiagent Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[13] Christian M. Ernst,et al. Multi-armed Bandit Allocation Indices , 1989 .
[14] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[15] Yoav Shoham,et al. Multiagent Systems - Algorithmic, Game-Theoretic, and Logical Foundations , 2009 .
[16] Karl Tuyls,et al. A Comparative Study of Multi-agent Reinforcement Learning Dynamics , 2010 .
[17] H. Jaap van den Herik,et al. Multi-agent Learning Dynamics: A Survey , 2007, CIA.
[18] D. Cliff,et al. Zero is Not Enough: On The Lower Limit of Agent Intelligence For Continuous Double Auction Markets , 1997 .
[19] Ali Hortaçsu,et al. Winner's Curse, Reserve Prices and Endogenous Entry: Empirical Insights from Ebay Auctions , 2003 .
[20] Karl Tuyls,et al. An Evolutionary Dynamical Analysis of Multi-Agent Learning in Iterated Games , 2005, Autonomous Agents and Multi-Agent Systems.
[21] A. Roth,et al. Predicting How People Play Games: Reinforcement Learning in Experimental Games with Unique, Mixed Strategy Equilibria , 1998 .
[22] Simon Parsons,et al. Discovering the game in auctions , 2008 .
[23] Michael L. Littman,et al. A Cognitive Hierarchy Model Applied to the Lemonade Game , 2010, Interactive Decision Theory and Game Theory.
[24] J. Huber,et al. The value of information in a multi-agent market model , 2006, physics/0610026.
[25] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[26] Peter McBurney,et al. A Novel Method for Strategy Acquisition and Its Application to a Double-Auction Market Game , 2010, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[27] Michael H. Bowling,et al. Bayes' Bluff: Opponent Modelling in Poker , 2005, UAI 2005.
[28] Karl Tuyls,et al. Replicator Dynamics for Multi-agent Learning: An Orthogonal Approach , 2009, ALA.
[29] Yishay Mansour,et al. Nash Convergence of Gradient Dynamics in General-Sum Games , 2000, UAI.
[30] Dave Cliff,et al. Less Than Human: Simple Adaptive Trading Agents for CDA Markets , 1998 .
[31] Y. Mansour,et al. Algorithmic Game Theory: Learning, Regret Minimization, and Equilibria , 2007 .
[32] Peter Vrancx,et al. Networks of Learning Automata and Limiting Games , 2007, Adaptive Agents and Multi-Agents Systems.
[33] Peter McBurney,et al. Evolutionary mechanism design: a review , 2010, Autonomous Agents and Multi-Agent Systems.
[34] Martin Zinkevich,et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.
[35] Tom Lenaerts,et al. A selection-mutation model for q-learning in multi-agent systems , 2003, AAMAS '03.
[36] E. Zeeman. Dynamics of the evolution of animal conflicts , 1981 .
[37] R. McAfee,et al. Auctions and Bidding , 1986 .
[38] D. Stauffer. Life, Love and Death: Models of Biological Reproduction and Aging , 1999 .
[39] B. Malkiel. The Efficient Market Hypothesis and Its Critics , 2003 .
[40] Simon Parsons,et al. What evolutionary game theory tells us about multiagent learning , 2007, Artif. Intell..
[41] John N. Tsitsiklis,et al. Asynchronous Stochastic Approximation and Q-Learning , 1994, Machine Learning.
[42] Lihong Li,et al. PAC model-free reinforcement learning , 2006, ICML.
[43] Abraham Neyman,et al. From Markov Chains to Stochastic Games , 2003 .
[44] M. Littman,et al. Q-learning in Two-Player Two-Action Games , 2009 .
[45] J. Fox. The Myth of the Rational Market: A History of Risk, Reward, and Delusion on Wall Street , 2009 .
[46] J M Smith,et al. Evolution and the theory of games , 1976 .
[47] Rense Corten,et al. Game Theory Evolving: A Problem-Centered Introduction to Modeling Strategic Interaction (Second Edition) by Herbert Gintis , 2009, J. Artif. Soc. Soc. Simul..
[48] M. Sutter,et al. Is more information always better?: Experimental financial markets with cumulative information , 2008 .
[49] Leslie G. Valiant,et al. A theory of the learnable , 1984, STOC '84.
[50] Yoav Shoham,et al. If multi-agent learning is the answer, what is the question? , 2007, Artif. Intell..
[51] D. Cliff. Minimal-Intelligence Agents for Bargaining Behaviors in Market-Based Environments , 1997 .
[52] Victor R. Lesser,et al. A Multiagent Reinforcement Learning Algorithm with Non-linear Dynamics , 2008, J. Artif. Intell. Res..
[54] Bruce Bueno de Mesquita,et al. Game Theory, Political Economy, and the Evolving Study of War and Peace , 2006, American Political Science Review.
[55] Kumpati S. Narendra,et al. Learning Automata - A Survey , 1974, IEEE Trans. Syst. Man Cybern..
[56] H. Robbins. Some aspects of the sequential design of experiments , 1952 .
[57] Shie Mannor,et al. PAC Bounds for Multi-armed Bandit and Markov Decision Processes , 2002, COLT.
[58] P. Taylor,et al. Evolutionarily Stable Strategies and Game Dynamics , 1978 .
[59] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[60] G. Tesauro,et al. Analyzing Complex Strategic Interactions in Multi-Agent Systems , 2002 .
[61] R. Agrawal. Sample mean based index policies by O(log n) regret for the multi-armed bandit problem , 1995, Advances in Applied Probability.
[62] Karl Tuyls,et al. Theoretical Advantages of Lenient Learners: An Evolutionary Game Theoretic Perspective , 2008, J. Mach. Learn. Res..
[63] A. Hama. Predictably Irrational: The Hidden Forces That Shape Our Decisions , 2010 .
[64] Jonathan Schaeffer,et al. Improved Opponent Modeling in Poker , 2000 .
[65] Karl Tuyls,et al. Evolutionary Dynamics of Regret Minimization , 2010, ECML/PKDD.
[66] Manuela M. Veloso,et al. Multiagent learning using a variable learning rate , 2002, Artif. Intell..
[67] Gerhard Weiß,et al. Distributed reinforcement learning , 1995, Robotics Auton. Syst..
[68] Howie Choset,et al. Coverage for robotics – A survey of recent results , 2001, Annals of Mathematics and Artificial Intelligence.
[69] J. Cross. A Stochastic Learning Model of Economic Behavior , 1973 .
[70] Michael Kirchler,et al. Partial knowledge is a dangerous thing - On the value of asymmetric fundamental information in asset markets , 2010 .
[71] Leigh Tesfatsion,et al. Market power and efficiency in a computational electricity market with discriminatory double-auction pricing , 2001, IEEE Trans. Evol. Comput..
[72] Karl Tuyls,et al. Frequency adjusted multi-agent Q-learning , 2010, AAMAS.
[73] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[74] Peter Auer,et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs , 2003, J. Mach. Learn. Res..
[75] Robert Gibbons,et al. A primer in game theory , 1992 .
[76] Sönke Albers,et al. Vickrey vs. eBay: Why Second-Price Sealed-Bid Auctions Lead to More Realistic Price-Demand Functions , 2010, Int. J. Electron. Commer..
[77] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[78] Simon Parsons,et al. Auction Analysis by Normal Form Game Approximation , 2008, 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology.
[79] R. Bellman. A Markovian Decision Process , 1957 .
[80] M. Hirsch,et al. Differential Equations, Dynamical Systems, and an Introduction to Chaos , 2003 .
[81] T. D. Schneider,et al. Evolution of biological information. , 2000, Nucleic acids research.
[82] Yoav Shoham,et al. New Criteria and a New Algorithm for Learning in Multi-Agent Systems , 2004, NIPS.
[83] Ryszard Kowalczyk,et al. Dynamic analysis of multiagent Q-learning with ε-greedy exploration , 2009, ICML '09.
[84] M. Thathachar,et al. Networks of Learning Automata: Techniques for Online Stochastic Optimization , 2003 .
[85] Peter Auer,et al. Near-optimal Regret Bounds for Reinforcement Learning , 2008, J. Mach. Learn. Res..
[86] Michael H. Bowling,et al. Convergence and No-Regret in Multiagent Learning , 2004, NIPS.
[87] J. Bather,et al. Multi-Armed Bandit Allocation Indices , 1990 .
[88] S. Parsons,et al. Everything you wanted to know about double auctions, but were afraid to (bid or) ask , 2006 .
[89] Michael L. Littman,et al. Friend-or-Foe Q-learning in General-Sum Games , 2001, ICML.
[90] William H. Sandholm,et al. Population Games And Evolutionary Dynamics , 2010, Economic learning and social evolution.
[91] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[92] Marco Dorigo,et al. Teamwork in Self-Organized Robot Colonies , 2009, IEEE Transactions on Evolutionary Computation.
[93] E. Scalas,et al. The value of information in financial markets: An agent-based simulation , 2007, 0712.2687.
[94] David Sklansky,et al. The Theory of Poker , 1999 .
[95] J. Huber,et al. 'J'-shaped returns to timing advantage in access to information - Experimental evidence and a tentative explanation , 2007 .
[96] Simon Parsons,et al. A novel method for automatic strategy acquisition in N-player non-zero-sum games , 2006, AAMAS '06.
[97] Doyle Brunson. Super/System: A Course in Power Poker , 1994 .
[98] Jonathan Schaeffer,et al. Approximating Game-Theoretic Optimal Strategies for Full-scale Poker , 2003, IJCAI.
[99] Nicholas R. Jennings,et al. Analysing Buyers' and Sellers' Strategic Interactions in Marketplaces: An Evolutionary Game Theoretic Approach , 2007, AMEC/TADA.
[100] Manuela M. Veloso,et al. Multiagent Systems: A Survey from a Machine Learning Perspective , 2000, Auton. Robots.
[101] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[102] L. Shapley,et al. Stochastic Games , 1953, Proceedings of the National Academy of Sciences.
[103] Peter Stone,et al. Multiagent learning is not the answer. It is the question , 2007, Artif. Intell..
[104] Stephen Martin,et al. Market Power and/or Efficiency? , 1988 .
[105] R. Weber. On the Gittins Index for Multiarmed Bandits , 1992 .
[106] Tilman Börgers,et al. Learning Through Reinforcement and Replicator Dynamics , 1997 .
[107] Dov Monderer,et al. A Learning Approach to Auctions , 1998 .
[108] John Dickhaut,et al. Price Formation in Double Auctions , 2001, E-Commerce Agents.
[109] K. Tuyls,et al. Lenient Frequency Adjusted Q-learning , 2010 .