[1] Marc G. Bellemare, et al. The Reactor: A Sample-Efficient Actor-Critic Architecture, 2017, arXiv.
[2] Marc G. Bellemare, et al. The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning, 2017, ICLR.
[3] Daan Wierstra, et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models, 2014, ICML.
[4] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[5] Carl E. Rasmussen, et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, 2011, ICML.
[6] Benjamin Van Roy, et al. Generalization and Exploration via Randomized Value Functions, 2014, ICML.
[7] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[8] Marc G. Bellemare, et al. Skip Context Tree Switching, 2014, ICML.
[9] Alex Graves, et al. DRAW: A Recurrent Neural Network For Image Generation, 2015, ICML.
[10] Pierre-Yves Oudeyer, et al. Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress, 2012, NIPS.
[11] Koray Kavukcuoglu, et al. Pixel Recurrent Neural Networks, 2016, ICML.
[12] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration, 2016, NIPS.
[13] Michael L. Littman, et al. An analysis of model-based Interval Estimation for Markov Decision Processes, 2008, J. Comput. Syst. Sci.
[14] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[15] R. French. Catastrophic forgetting in connectionist networks, 1999, Trends in Cognitive Sciences.
[16] Peter Dayan, et al. Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search, 2012, NIPS.
[17] Matthias Bethge, et al. A note on the evaluation of generative models, 2015, ICLR.
[18] Marc G. Bellemare, et al. Safe and Efficient Off-Policy Reinforcement Learning, 2016, NIPS.
[19] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[20] J. Schulman, et al. Variational Information Maximizing Exploration, 2016.
[21] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1996, NIPS.
[22] Filip De Turck, et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning, 2016, NIPS.
[23] Tom Schaul, et al. Unifying Count-Based Exploration and Intrinsic Motivation, 2016, NIPS.
[24] Peter Auer, et al. Near-optimal Regret Bounds for Reinforcement Learning, 2008, J. Mach. Learn. Res.
[25] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.
[26] Pierre-Yves Oudeyer, et al. Intrinsic Motivation Systems for Autonomous Mental Development, 2007, IEEE Transactions on Evolutionary Computation.
[27] John N. Tsitsiklis, et al. Analysis of temporal-difference learning with function approximation, 1996, NIPS.
[28] Alex Graves, et al. Conditional Image Generation with PixelCNN Decoders, 2016, NIPS.
[29] Joel Veness, et al. Context Tree Switching, 2012, Data Compression Conference.
[30] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, MIT Press.
[31] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.