LinUCB Applied to Monte-Carlo Tree Search
[1] Wei Chu,et al. Contextual Bandits with Linear Payoff Functions , 2011, AISTATS.
[2] Akihiro Kishimoto,et al. Scalable Distributed Monte-Carlo Tree Search , 2011, SOCS.
[3] Michèle Sebag,et al. The grand challenge of computer Go , 2012, Commun. ACM.
[4] David Silver,et al. Monte-Carlo tree search and rapid action value estimation in computer Go , 2011, Artif. Intell..
[5] Murray Campbell,et al. Deep Blue , 2002, Artif. Intell..
[6] Michael Buro,et al. From Simple Features to Sophisticated Evaluation Functions , 1998, Computers and Games.
[7] Sébastien Bubeck,et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems , 2012, Found. Trends Mach. Learn..
[8] Thomas J. Walsh,et al. Exploring compact reinforcement-learning representations with linear regression , 2009, UAI.
[9] Dana S. Nau,et al. An Analysis of Forward Pruning , 1994, AAAI.
[10] David Silver,et al. Combining online and offline knowledge in UCT , 2007, ICML '07.
[11] Bruno Bouzy,et al. Computer Go: An AI oriented survey , 2001, Artif. Intell..
[12] Richard E. Korf,et al. Best-First Minimax Search , 1996, Artif. Intell..
[13] Michael Buro,et al. Minimum Proof Graphs and Fastest-Cut-First Search Heuristics , 2009, IJCAI.
[14] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[15] Tomoyuki Kaneko,et al. Large-Scale Optimization for Evaluation Functions with Minimax Search , 2014, J. Artif. Intell. Res..
[16] Gerald Tesauro,et al. Monte-Carlo simulation balancing , 2009, ICML '09.
[17] Rémi Coulom,et al. Computing "Elo Ratings" of Move Patterns in the Game of Go , 2007, J. Int. Comput. Games Assoc..
[18] Wei Chu,et al. A contextual-bandit approach to personalized news article recommendation , 2010, WWW '10.
[19] Simon M. Lucas,et al. A Survey of Monte Carlo Tree Search Methods , 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[20] Bruno Bouzy,et al. Monte-Carlo Go Developments , 2003, ACG.
[21] Christopher D. Rosin,et al. Multi-armed bandits with episode context , 2011, Annals of Mathematics and Artificial Intelligence.
[22] Csaba Szepesvári,et al. Bandit Based Monte-Carlo Planning , 2006, ECML.
[23] Tomoyuki Kaneko,et al. LinUCB Applied to Monte-Carlo Tree Search , 2015, ACG.
[24] Rémi Munos,et al. Online gradient descent for least squares regression: Non-asymptotic bounds and application to bandits , 2013, ArXiv.
[25] Donald E. Knuth,et al. The Solution for the Branching Factor of the Alpha-Beta Pruning Algorithm , 1981, ICALP.