Yoshua Bengio | Amanpreet Singh | Anirudh Goyal | Dhruv Batra | Devi Parikh | Nan Rosemary Ke | Ahmed Touati
[1] Ole Winther, et al. Sequential Neural Models with Stochastic Layers, 2016, NIPS.
[2] Pieter Abbeel, et al. Variational Lossy Autoencoder, 2016, ICLR.
[3] Fabio Viola, et al. Learning and Querying Fast Generative Models for Reinforcement Learning, 2018, ArXiv.
[4] Trevor Darrell, et al. Loss is its own Reward: Self-Supervision for Reinforcement Learning, 2016, ICLR.
[5] Lukás Burget, et al. Recurrent neural network based language model, 2010, INTERSPEECH.
[6] Uri Shalit, et al. Deep Kalman Filters, 2015, ArXiv.
[7] Yasemin Altun, et al. Relative Entropy Policy Search, 2010.
[8] Yoshua Bengio, et al. A Recurrent Latent Variable Model for Sequential Data, 2015, NIPS.
[9] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[10] Honglak Lee, et al. Action-Conditional Video Prediction using Deep Networks in Atari Games, 2015, NIPS.
[11] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[12] Michael L. Littman, et al. Dimension reduction and its application to model-based exploration in continuous spaces, 2010, Machine Learning.
[13] Erik Talvitie, et al. The Effect of Planning Shape on Dyna-style Planning in High-dimensional State Spaces, 2018, ArXiv.
[14] Erik Talvitie, et al. Model Regularization for Stable Sample Rollouts, 2014, UAI.
[15] Daan Wierstra, et al. Recurrent Environment Simulators, 2017, ICLR.
[16] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[17] Jürgen Schmidhuber, et al. Recurrent World Models Facilitate Policy Evolution, 2018, NeurIPS.
[18] Patrick van der Smagt, et al. Unsupervised Real-Time Control Through Variational Empowerment, 2017, ISRR.
[19] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration, 2016, NIPS.
[20] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[21] Samy Bengio, et al. Generating Sentences from a Continuous Space, 2015, CoNLL.
[22] Stefan Schaal, et al. Robot Learning From Demonstration, 1997, ICML.
[23] Honglak Lee, et al. Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion, 2018, NeurIPS.
[24] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[25] Richard S. Sutton, et al. Dyna, an integrated architecture for learning, planning, and reacting, 1990, ACM SIGART Bulletin.
[26] Razvan Pascanu, et al. Imagination-Augmented Agents for Deep Reinforcement Learning, 2017, NIPS.
[27] Satinder Singh, et al. Value Prediction Network, 2017, NIPS.
[28] Christian Osendorfer, et al. Learning Stochastic Recurrent Networks, 2014, NIPS.
[29] Yoshua Bengio, et al. Professor Forcing: A New Algorithm for Training Recurrent Networks, 2016, NIPS.
[30] David Vázquez, et al. PixelVAE: A Latent Variable Model for Natural Images, 2016, ICLR.
[31] Percy Liang, et al. Generating Sentences by Editing Prototypes, 2017, TACL.
[32] Maximilian Karl, et al. Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data, 2016, ICLR.
[33] Yoshua Bengio, et al. Z-Forcing: Training Stochastic Recurrent Networks, 2017, NIPS.
[34] Dean Pomerleau, et al. Efficient Training of Artificial Neural Networks for Autonomous Navigation, 1991, Neural Computation.
[35] Alex M. Andrew, et al. Reinforcement Learning: An Introduction, 1998.
[36] Tom Schaul, et al. FeUdal Networks for Hierarchical Reinforcement Learning, 2017, ICML.
[37] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[38] Samy Bengio, et al. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks, 2015, NIPS.
[39] David Q. Mayne, et al. Constrained model predictive control: Stability and optimality, 2000, Autom.
[40] Kavosh Asadi, et al. Lipschitz Continuity in Model-based Reinforcement Learning, 2018, ICML.
[41] Pieter Abbeel, et al. Prediction and Control with Temporal Segment Models, 2017, ICML.
[42] Andrew James Smith, et al. Applications of the self-organising map to reinforcement learning, 2002, Neural Networks.
[43] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[44] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[45] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[46] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[47] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[48] Sergey Levine, et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, 2015, ArXiv.
[49] Sergey Levine, et al. Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings, 2018, ICML.
[50] Daan Wierstra, et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models, 2014, ICML.
[51] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[52] Carl E. Rasmussen, et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, 2011, ICML.
[53] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[54] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.