Regioned Episodic Reinforcement Learning
Jiarui Jin | Cong Chen | Ming Zhou | Weinan Zhang | Jun Wang | Alexander J. Smola | David Wipf | Rasool Fakoor | Yong Yu
[1] Marlos C. Machado, et al. Exploration in Reinforcement Learning with Deep Covering Options, 2020, ICLR.
[2] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[3] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[4] Kavosh Asadi, et al. Lipschitz Continuity in Model-based Reinforcement Learning, 2018, ICML.
[5] Sae-Young Chung, et al. Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update, 2018, NeurIPS.
[6] Pieter Abbeel, et al. Automatic Goal Generation for Reinforcement Learning Agents, 2017, ICML.
[7] Tom Schaul, et al. FeUdal Networks for Hierarchical Reinforcement Learning, 2017, ICML.
[8] Sergey Levine, et al. Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement, 2020, NeurIPS.
[9] Marlos C. Machado, et al. Eigenoption Discovery through the Deep Successor Representation, 2017, ICLR.
[10] Richard Socher, et al. Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards, 2019, NeurIPS.
[11] Sergey Levine, et al. Data-Efficient Hierarchical Reinforcement Learning, 2018, NeurIPS.
[12] Peter Dayan, et al. Hippocampal Contributions to Control: The Third Way, 2007, NIPS.
[13] Daphna Weinshall, et al. On The Power of Curriculum Learning in Training Deep Networks, 2019, ICML.
[14] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[15] Ion Stoica, et al. Multi-Level Discovery of Deep Options, 2017, arXiv.
[16] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[17] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, arXiv.
[18] Ramona O. Hopkins, et al. Semantic Memory and the Human Hippocampus, 2003, Neuron.
[19] Chris Drummond, et al. Accelerating Reinforcement Learning by Composing Solutions of Automatically Identified Subtasks, 2011, J. Artif. Intell. Res.
[20] Pieter Abbeel, et al. Meta Learning Shared Hierarchies, 2017, ICLR.
[21] Alicia P. Wolfe, et al. Identifying useful subgoals in reinforcement learning by local graph partitioning, 2005, ICML.
[22] Demis Hassabis, et al. Neural Episodic Control, 2017, ICML.
[23] Pieter Abbeel, et al. Benchmarking Deep Reinforcement Learning for Continuous Control, 2016, ICML.
[24] Guangwen Yang, et al. Episodic Memory Deep Q-Networks, 2018, IJCAI.
[25] Matthew J. Salganik, et al. Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market, 2006, Science.
[26] Sergey Levine, et al. Temporal Difference Models: Model-Free Deep RL for Model-Based Control, 2018, ICLR.
[27] Jason Weston, et al. Curriculum learning, 2009, ICML.
[28] Guangwen Yang, et al. Episodic Reinforcement Learning with Associative Memory, 2020, ICLR.
[29] Andrew G. Barto, et al. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining, 2009, NIPS.
[30] Joel Z. Leibo, et al. Model-Free Episodic Control, 2016, arXiv.
[31] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[32] D. Marr, et al. Simple memory: a theory for archicortex, 1971, Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences.
[33] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[34] Charles Blundell, et al. Fast deep reinforcement learning using online adjustments from the past, 2018, NeurIPS.
[35] Richard Socher, et al. Learning World Graphs to Accelerate Hierarchical Reinforcement Learning, 2019, arXiv.
[36] George Konidaris, et al. Option Discovery using Deep Skill Chaining, 2020, ICLR.
[37] R. Sutherland, et al. Configural association theory: The role of the hippocampal formation in learning, memory, and amnesia, 1989, Psychobiology.
[38] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[39] Yifan Wu, et al. The Laplacian in RL: Learning Representations with Efficient Approximations, 2018, ICLR.
[40] Sergey Levine, et al. Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction, 2019, NeurIPS.
[41] Yuan Zhou, et al. Exploration via Hindsight Goal Generation, 2019, NeurIPS.
[42] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[43] Amit K. Roy-Chowdhury, et al. Learning from Trajectories via Subgoal Discovery, 2019, NeurIPS.
[44] I. Gilboa, et al. Case-Based Decision Theory, 1995.
[45] Marcin Andrychowicz, et al. Hindsight Experience Replay, 2017, NIPS.
[46] Marcin Andrychowicz, et al. Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research, 2018, arXiv.
[47] Daniel Guo, et al. Agent57: Outperforming the Atari Human Benchmark, 2020, ICML.
[48] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[49] Marc G. Bellemare, et al. Count-Based Exploration with Neural Density Models, 2017, ICML.
[50] Zhen Wang, et al. On the Effectiveness of Least Squares Generative Adversarial Networks, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.