Adaptive Exploration Using Stochastic Neurons
[1] Marco Wiering, et al. Explorations in efficient reinforcement learning, 1999.
[2] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[3] Günther Palm, et al. Value-Difference Based Exploration: Adaptive Control between Epsilon-Greedy and Softmax, 2011, KI.
[4] Nees Jan van Eck, et al. Application of reinforcement learning to the game of Othello, 2008, Comput. Oper. Res.
[5] Peter Auer, et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs, 2003, J. Mach. Learn. Res.
[6] Mahesan Niranjan, et al. On-line Q-learning using connectionist systems, 1994.
[7] Friedhelm Schwenker, et al. Learning a Strategy with Neural Approximated Temporal-Difference Methods in English Draughts, 2010, 2010 20th International Conference on Pattern Recognition.
[8] Sebastian Thrun, et al. Efficient Exploration In Reinforcement Learning, 1992.
[9] Chris Watkins, et al. Learning from delayed rewards, 1989.
[10] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[11] R. J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[12] Stefan Edelkamp, et al. KI 2011: Advances in Artificial Intelligence, 2011, Lecture Notes in Computer Science.
[13] Daniel Kudenko, et al. Online learning of shaping rewards in reinforcement learning, 2010, Neural Networks.
[14] P. Dayan, et al. Cortical substrates for exploratory decisions in humans, 2006, Nature.