Katja Hofmann | Sebastian Tschiatschek | Kai Arulkumaran | Jan Stühmer
[1] Jan Peters, et al. Reinforcement learning in robotics: A survey, 2013, Int. J. Robotics Res.
[2] Romain Laroche, et al. Hybrid Reward Architecture for Reinforcement Learning, 2017, NIPS.
[3] Frans A. Oliehoek, et al. Learning in POMDPs with Monte Carlo Tree Search, 2017, ICML.
[4] Sergey Levine, et al. Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods, 2018, IEEE International Conference on Robotics and Automation (ICRA).
[5] Ole Winther, et al. Sequential Neural Models with Stochastic Layers, 2016, NIPS.
[6] D. Aberdeen, et al. A (Revised) Survey of Approximate Methods for Solving Partially Observable Markov Decision Processes, 2003.
[7] Uri Shalit, et al. Deep Kalman Filters, 2015, ArXiv.
[8] Sergey Levine, et al. Learning to Run challenge: Synthesizing physiologically accurate motion using deep reinforcement learning, 2018, ArXiv.
[9] Oriol Vinyals, et al. Neural Discrete Representation Learning, 2017, NIPS.
[10] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[11] Stefan Lee, et al. Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning, 2017, IEEE International Conference on Computer Vision (ICCV).
[12] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[13] Fabio Viola, et al. Learning and Querying Fast Generative Models for Reinforcement Learning, 2018, ArXiv.
[14] Noah D. Goodman, et al. Amortized Inference in Probabilistic Reasoning, 2014, CogSci.
[15] Razvan Pascanu, et al. Imagination-Augmented Agents for Deep Reinforcement Learning, 2017, NIPS.
[16] Yishay Mansour, et al. Approximate Planning in Large POMDPs via Reusable Trajectories, 1999, NIPS.
[17] Patrick M. Pilarski, et al. Temporal-Difference Learning to Assist Human Decision Making during the Control of an Artificial Limb, 2013, ArXiv.
[18] Longxin Lin. Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching, 2004, Machine Learning.
[19] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[20] Catholijn M. Jonker, et al. Learning Multimodal Transition Dynamics for Model-Based Reinforcement Learning, 2017, ArXiv.
[21] Honglak Lee, et al. Action-Conditional Video Prediction using Deep Networks in Atari Games, 2015, NIPS.
[22] Fabio Viola, et al. Generative Temporal Models with Spatial Memory for Partially Observed Environments, 2018, ICML.
[23] Peter Stone, et al. Deep Recurrent Q-Learning for Partially Observable MDPs, 2015, AAAI Fall Symposia.
[24] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[25] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[26] Uri Shalit, et al. Structured Inference Networks for Nonlinear State Space Models, 2016, AAAI.
[27] Rich Sutton, et al. A Deeper Look at Planning as Learning from Replay, 2015, ICML.
[28] Joelle Pineau, et al. Bayes-Adaptive POMDPs, 2007, NIPS.
[29] Daan Wierstra, et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models, 2014, ICML.
[30] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[31] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs, 1978, Oper. Res.
[32] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[33] Marlos C. Machado, et al. Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents, 2017, J. Artif. Intell. Res.
[34] Gerald Tesauro, et al. Temporal difference learning and TD-Gammon, 1995, CACM.
[35] Joel Veness, et al. Monte-Carlo Planning in Large POMDPs, 2010, NIPS.
[36] Nikos A. Vlassis, et al. Robot Planning in Partially Observable Continuous Domains, 2005, BNAIC.
[37] Michael I. Jordan, et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes, 1994, ICML.
[38] Marilyn A. Walker, et al. Reinforcement Learning for Spoken Dialogue Systems, 1999, NIPS.
[39] Yann LeCun, et al. Prediction Under Uncertainty with Error-Encoding Networks, 2017, ArXiv.
[40] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon, 1973, Oper. Res.
[41] Gabriel Kalweit, et al. Uncertainty-driven Imagination for Continuous Deep Reinforcement Learning, 2017, CoRL.