Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective
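The paper's central claim, reflected in its title, is that the encoder, the latent-space dynamics model, and the policy can all be trained by maximizing a single objective that lower-bounds expected returns, in place of the separate representation, model, and policy losses common in model-based RL. As a rough sketch of what such a joint objective can look like (the latents z_t, encoder e_phi, model m_theta, and KL weight lambda are illustrative notation, not the paper's exact formulation):

\[
\max_{\phi,\,\theta,\,\pi}\;
\mathbb{E}\Big[\sum_{t}\gamma^{t}\Big(r(z_t,a_t)\;-\;\lambda\,\mathrm{KL}\big(m_\theta(z_{t+1}\mid z_t,a_t)\,\big\|\,e_\phi(z_{t+1}\mid s_{t+1})\big)\Big)\Big]
\]

Maximizing one such bound couples the three components: the KL term favors representations the latent model can predict, while the reward term keeps both the representation and the model focused on control-relevant information.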
[1] S. Levine, et al. INFOrmation Prioritization through EmPOWERment in Visual Model-Based RL, 2022, ICLR.
[2] Amir-massoud Farahmand, et al. Value Gradient weighted Model-Based Reinforcement Learning, 2022, ICLR.
[3] Xiaolong Wang, et al. Temporal Difference Learning for Model Predictive Control, 2022, ICML.
[4] M. Maximo, et al. A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems, 2022, IEEE Transactions on Neural Networks and Learning Systems.
[5] Ingook Jang, et al. DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations, 2021, ICML.
[6] Alessandro Lazaric, et al. Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning, 2021, ICLR.
[7] Rishabh Agarwal, et al. Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation, 2021, AAAI.
[8] Y. Efroni, et al. Provable RL with Exogenous Distractors via Multistep Inverse Dynamics, 2021, arXiv.
[9] Sergey Levine, et al. Mismatched No More: Joint Model-Policy Optimization for Model-Based RL, 2021, arXiv.
[10] Stefano Ermon, et al. Temporal Predictive Coding For Model-Based Planning In Latent Space, 2021, ICML.
[11] Sergey Levine, et al. Which Mutual-Information Representation Learning Objectives are Sufficient for Control?, 2021, NeurIPS.
[12] Satinder Singh, et al. Reward is enough for convex MDPs, 2021, NeurIPS.
[13] Aaron M. Dollar, et al. Model Predictive Actor-Critic: Accelerating Robot Skill Acquisition with Deep Reinforcement Learning, 2021, ICRA.
[14] Che Wang, et al. Randomized Ensembled Double Q-Learning: Learning Fast Without a Model, 2021, ICLR.
[15] Florian Shkurti, et al. Latent Skill Planning for Exploration and Transfer, 2020, ICLR.
[16] Li Liu, et al. A Review of Uncertainty Quantification in Deep Learning: Techniques, Applications and Challenges, 2020, Information Fusion.
[17] Jessica B. Hamrick, et al. On the role of planning in model-based deep reinforcement learning, 2020, ICLR.
[18] Mohammad Norouzi, et al. Mastering Atari with Discrete World Models, 2020, ICLR.
[19] Andrew Gordon Wilson, et al. On the model-based stochastic value gradient for continuous reinforcement learning, 2020, L4DC.
[20] T. Taniguchi, et al. Dreaming: Model-based Reinforcement Learning by Latent Imagination without Reconstruction, 2021, ICRA.
[21] S. Levine, et al. Learning Invariant Representations for Reinforcement Learning without Reconstruction, 2020, ICLR.
[22] S. Levine, et al. Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers, 2020, ICLR.
[23] Satinder Singh, et al. The Value Equivalence Principle for Model-Based Reinforcement Learning, 2020, NeurIPS.
[24] S. Levine, et al. γ-Models: Generative Temporal Difference Learning for Infinite-Horizon Prediction, 2020, arXiv.
[25] Weinan Zhang, et al. Model-based Policy Optimization with Unsupervised Model Adaptation, 2020, NeurIPS.
[26] Jackie Kay, et al. Local Search for Policy Iteration in Continuous Control, 2020, arXiv.
[27] David Held, et al. Learning Off-Policy with Online Planning, 2020, CoRL.
[28] Erin J. Talvitie, et al. Hallucinating Value: A Pitfall of Dyna-style Planning with Imperfect Environment Models, 2020, arXiv.
[29] S. Levine, et al. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems, 2020, arXiv.
[30] Pieter Abbeel, et al. Model-Augmented Actor-Critic: Backpropagating through Paths, 2020, ICLR.
[31] Vikash Kumar, et al. A Game Theoretic Framework for Model Based Reinforcement Learning, 2020, ICML.
[32] M. Ghavamzadeh, et al. Policy-Aware Model Learning for Policy Gradient Methods, 2020, arXiv.
[33] Roberto Calandra, et al. Objective Mismatch in Model-based Reinforcement Learning, 2020, L4DC.
[34] Jimmy Ba, et al. Dream to Control: Learning Behaviors by Latent Imagination, 2019, ICLR.
[35] Demis Hassabis, et al. Mastering Atari, Go, chess and shogi by planning with a learned model, 2019, Nature.
[36] Akshay Krishnamurthy, et al. Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning, 2019, ICML.
[37] Marcello Restelli, et al. Gradient-Aware Model-based Policy Search, 2019, AAAI.
[38] Sergey Levine, et al. Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model, 2019, NeurIPS.
[39] Sergey Levine, et al. Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives, 2019, ICLR.
[40] Jimmy Ba, et al. Exploring Model-based Planning with Policy Networks, 2019, ICLR.
[41] Sergey Levine, et al. Model-Based Reinforcement Learning for Atari, 2019, ICLR.
[42] Pieter Abbeel, et al. Benchmarking Model-Based Reinforcement Learning, 2019, arXiv.
[43] Sergey Levine, et al. When to Trust Your Model: Model-Based Policy Optimization, 2019, NeurIPS.
[44] Ruben Villegas, et al. Learning Latent Dynamics for Planning from Pixels, 2018, ICML.
[45] Yuandong Tian, et al. Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees, 2018, ICLR.
[46] Byron Boots, et al. Differentiable MPC for End-to-end Planning and Control, 2018, NeurIPS.
[47] Jürgen Schmidhuber, et al. Recurrent World Models Facilitate Policy Evolution, 2018, NeurIPS.
[48] Honglak Lee, et al. Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion, 2018, NeurIPS.
[49] Sergey Levine, et al. Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models, 2018, NeurIPS.
[50] Andrew Gordon Wilson, et al. Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs, 2018, NeurIPS.
[51] Herke van Hoof, et al. Addressing Function Approximation Error in Actor-Critic Methods, 2018, ICML.
[52] Yuval Tassa, et al. Maximum a Posteriori Policy Optimisation, 2018, ICLR.
[53] Pieter Abbeel, et al. Model-Ensemble Trust-Region Policy Optimization, 2018, ICLR.
[54] Fabio Viola, et al. Learning and Querying Fast Generative Models for Reinforcement Learning, 2018, arXiv.
[55] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[56] Amir-massoud Farahmand, et al. Iterative Value-Aware Model Learning, 2018, NeurIPS.
[57] Razvan Pascanu, et al. Imagination-Augmented Agents for Deep Reinforcement Learning, 2017, NIPS.
[58] Satinder Singh, et al. Value Prediction Network, 2017, NIPS.
[59] Luca Rigazio, et al. Path Integral Networks: End-to-End Differentiable Optimal Control, 2017, arXiv.
[60] Daniel Nikovski, et al. Value-Aware Loss Function for Model-based Reinforcement Learning, 2017, AISTATS.
[61] Sergey Levine, et al. Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning, 2017, ICLR.
[62] Geoffrey E. Hinton, et al. Layer Normalization, 2016, arXiv.
[63] Pieter Abbeel, et al. Value Iteration Networks, 2016, NIPS.
[64] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[65] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[66] Yuval Tassa, et al. Learning Continuous Control Policies by Stochastic Value Gradients, 2015, NIPS.
[67] Honglak Lee, et al. Action-Conditional Video Prediction using Deep Networks in Atari Games, 2015, NIPS.
[68] Alborz Geramifard, et al. Reinforcement learning with misspecified model classes, 2013, ICRA.
[69] M. Botvinick, et al. Planning as inference, 2012, Trends in Cognitive Sciences.
[70] Vicenç Gómez, et al. Optimal control as a graphical model inference problem, 2009, Machine Learning.
[71] Carl E. Rasmussen, et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, 2011, ICML.
[72] Yasemin Altun, et al. Relative Entropy Policy Search, 2010, AAAI.
[73] Marc Toussaint, et al. Robot trajectory optimization using approximate inference, 2009, ICML.
[74] Michael I. Jordan, et al. An Introduction to Variational Methods for Graphical Models, 1999, Machine Learning.
[75] Hagai Attias. Planning by Probabilistic Inference, 2003, AISTATS.
[76] Sebastian Thrun, et al. Issues in Using Function Approximation for Reinforcement Learning, 1999.
[77] Richard S. Sutton. Dyna, an integrated architecture for learning, planning, and reacting, 1990, ACM SIGART Bulletin.
[78] Manfred Morari, et al. Model predictive control: Theory and practice - A survey, 1989, Automatica.