Applying machine learning techniques to an imperfect information game
[1] Peter Norvig, et al. Artificial Intelligence: A Modern Approach, 1995.
[2] Roderick J. A. Little, et al. Statistical Analysis with Missing Data, 1988.
[3] Grégoire Montavon, et al. Neural Networks: Tricks of the Trade, 2012, Lecture Notes in Computer Science.
[4] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[5] V. Crawford. Learning the Optimal Strategy in a Zero-Sum Game, 1974.
[6] Bret Hoehn, et al. The Effectiveness of Opponent Modelling in a Small Imperfect Information Game, 2006.
[7] Fredrik A. Dahl, et al. The Lagging Anchor Algorithm: Reinforcement Learning in Two-Player Zero-Sum Games with Imperfect Information, 2002, Machine Learning.
[8] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[9] Klaus-Robert Müller, et al. Efficient BackProp, 2012, Neural Networks: Tricks of the Trade.
[10] Fredrik A. Dahl. The lagging anchor model for game learning—a solution to the Crawford puzzle, 2005.
[11] D. Papp. Dealing with imperfect information in poker, 1998.
[12] Peter Vamplew, et al. Using Stereotypes to Improve Early-Match Poker Play, 2008, Australasian Conference on Artificial Intelligence.
[13] Manuela M. Veloso, et al. Multiagent learning using a variable learning rate, 2002, Artif. Intell.
[14] H. Jaap van den Herik, et al. Progressive Strategies for Monte-Carlo Tree Search, 2008.
[15] Terence Conrad Schauenberg, et al. Opponent Modelling and Search in Poker, 2006.
[16] M. Spence. Job Market Signaling, 1973.
[17] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon, 1973, Oper. Res.
[18] J. Q. Smith. Decision Analysis: A Bayesian Approach, 1988.
[19] Kurt Driessens, et al. Bayes-Relational Learning of Opponent Models from Incomplete Information in No-Limit Poker, 2008, AAAI.
[20] Doyle Brunson. Super/System: A Course in Power Poker, 1994.
[21] Duane Szafron, et al. Using counterfactual regret minimization to create competitive multiplayer poker agents, 2010, AAMAS 2010.
[22] David Schnizlein, et al. State translation in no-limit poker, 2009.
[23] Kevin Swingler, et al. Applying Neural Networks: A Practical Guide, 1996.
[24] Kevin Waugh, et al. Monte Carlo Sampling for Regret Minimization in Extensive Games, 2009, NIPS.
[25] Jonathan Schaeffer, et al. Poker as a Testbed for Machine Intelligence Research, 1998.
[26] Aaron Davidson, et al. Opponent modeling in poker: learning and acting in a hostile and uncertain environment, 2002.
[27] Guillaume Chaslot, et al. Integrating Opponent Models with Monte-Carlo Tree Search in Poker, 2010, Interactive Decision Theory and Game Theory.
[28] Tommi S. Jaakkola, et al. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms, 2000, Machine Learning.
[29] Ken Binmore, et al. Fun and Games: A Text on Game Theory, 1992.
[30] Nando de Freitas, et al. An Introduction to MCMC for Machine Learning, 2004, Machine Learning.
[31] Kevin Waugh, et al. Abstraction pathologies in extensive games, 2009, AAMAS.
[32] Yishay Mansour, et al. Nash Convergence of Gradient Dynamics in General-Sum Games, 2000, UAI.
[33] J. von Neumann, et al. Theory of Games and Economic Behavior, 1945, 100 Years of Math Milestones.
[34] G. McLachlan, et al. The EM Algorithm and Extensions, 1996.
[35] Alex Acero, et al. Spoken Language Processing: A Guide to Theory, Algorithm and System Development, 2001.
[36] Donald Michie, et al. Introductory Readings in Expert Systems, 1982.
[37] Michael H. Bowling, et al. Regret Minimization in Games with Incomplete Information, 2007, NIPS.
[38] Nathan R. Sturtevant, et al. Understanding the Success of Perfect Information Monte Carlo Sampling in Game Tree Search, 2010, AAAI.
[39] Tony Jebara, et al. Machine Learning: Discriminative and Generative, 2012.
[40] Thomas M. Cover, et al. Elements of Information Theory, 2005.
[41] Charles M. Macal, et al. Managing Business Complexity: Discovering Strategic Solutions with Agent-Based Modeling and Simulation, 2007.
[42] J. Bather, et al. Multi-Armed Bandit Allocation Indices, 1990.
[43] Ron Kohavi, et al. A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection, 1995, IJCAI.
[44] Johannes Fürnkranz, et al. An Exploitative Monte-Carlo Poker Agent, 2009, LWA.
[45] Michael H. Bowling, et al. Bayes' Bluff: Opponent Modelling in Poker, 2005, UAI 2005.
[46] Jonathan Schaeffer, et al. The challenge of poker, 2002, Artif. Intell.
[47] Jonathan Schaeffer, et al. Approximating Game-Theoretic Optimal Strategies for Full-scale Poker, 2003, IJCAI.
[48] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[49] Guy Van den Broeck, et al. Monte-Carlo Tree Search in Poker Using Expected Reward Distributions, 2009, ACML.
[50] Javier Peña, et al. Smoothing Techniques for Computing Nash Equilibria of Sequential Games, 2010, Math. Oper. Res.
[51] Jonathan Schaeffer, et al. Using Probabilistic Knowledge and Simulation to Play Poker, 1999, AAAI/IAAI.
[52] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[53] Jonathan Schaeffer, et al. Game-Tree Search with Adaptation in Stochastic Imperfect-Information Games, 2004, Computers and Games.
[54] Judea Pearl, et al. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, 1991, Morgan Kaufmann series in representation and reasoning.
[55] Andrzej S. Kozek, et al. A rule of thumb (not only) for gamblers, 1995.
[56] Peter Bro Miltersen, et al. A near-optimal strategy for a heads-up no-limit Texas Hold'em poker tournament, 2007, AAMAS '07.
[57] Finn V. Jensen, et al. Bayesian Networks and Decision Graphs, 2001, Statistics for Engineering and Information Science.
[58] Gerald Tesauro, et al. Temporal difference learning and TD-Gammon, 1995, CACM.
[59] Tuomas Sandholm, et al. Computing an approximate jam/fold equilibrium for 3-player no-limit Texas Hold'em tournaments, 2008, AAMAS.
[60] Bill Chen, et al. The Mathematics of Poker, 2006.
[61] Jugal K. Kalita, et al. The Significance of Temporal-Difference Learning in Self-Play Training: TD-Rummy versus EVO-rummy, 2003, ICML.
[62] Michael H. Bowling, et al. Data Biased Robust Counter Strategies, 2009, AISTATS.
[63] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[64] Nicolò Cesa-Bianchi, et al. Gambling in a rigged casino: The adversarial multi-armed bandit problem, 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.
[65] Darse Billings. Algorithms and assessment in computer poker, 2006.
[66] Darse Billings, et al. A Tool for the Direct Assessment of Poker Decisions, 2006, J. Int. Comput. Games Assoc.
[67] Tuomas Sandholm, et al. A Competitive Texas Hold'em Poker Player via Automated Abstraction and Real-Time Equilibrium Computation, 2006, AAAI.
[68] Ian D. Watson, et al. Computer poker: A review, 2011, Artif. Intell.
[69] Matthew L. Ginsberg, et al. GIB: Steps Toward an Expert-Level Bridge-Playing Program, 1999, IJCAI.
[70] Elaine Rich, et al. User Modeling via Stereotypes, 1998, Cogn. Sci.