PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning
[1] Doina Precup, et al. The Option Keyboard: Combining Skills in Reinforcement Learning, 2019, NeurIPS.
[2] Sergey Levine, et al. From Language to Goals: Inverse Reinforcement Learning for Vision-Based Instruction Following, 2019, ICLR.
[3] Risto Miikkulainen, et al. Evolving explicit opponent models in game playing, 2007, GECCO.
[4] Alexey Dosovitskiy, et al. End-to-End Driving Via Conditional Imitation Learning, 2018, ICRA.
[5] Sergey Levine, et al. Deep Imitative Models for Flexible Inference, Planning, and Control, 2018, ICLR.
[6] Peter Englert, et al. Multi-task policy search for robotics, 2014, ICRA.
[7] Tom Schaul, et al. Deep Q-learning From Demonstrations, 2017, AAAI.
[8] David Warde-Farley, et al. Fast Task Inference with Variational Intrinsic Successor Features, 2019, ICLR.
[9] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[10] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[11] Kee-Eung Kim, et al. Inverse Reinforcement Learning in Partially Observable Environments, 2009, IJCAI.
[12] John D. Hunter, et al. Matplotlib: A 2D Graphics Environment, 2007, Computing in Science & Engineering.
[13] Matthieu Geist, et al. A Cascaded Supervised Learning Approach to Inverse Reinforcement Learning, 2013, ECML/PKDD.
[14] Olivier Sigaud, et al. Learning compact parameterized skills with a single regression, 2013, Humanoids.
[15] Natasha Jaques, et al. Multi-agent Social Reinforcement Learning Improves Generalization, 2020, ArXiv.
[16] Marlos C. Machado, et al. Eigenoption Discovery through the Deep Successor Representation, 2017, ICLR.
[17] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, MIT Press.
[18] Olivier Pietquin, et al. Observational Learning by Reinforcement Learning, 2017, AAMAS.
[19] Martin A. Riedmiller, et al. Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards, 2017, ArXiv.
[20] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[21] Tom Schaul, et al. Universal Successor Features Approximators, 2018, ICLR.
[22] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[23] Sergey Levine, et al. Learning Robust Rewards with Adversarial Inverse Reinforcement Learning, 2017, ICLR.
[24] Richard Tanburn, et al. Making Efficient Use of Demonstrations to Solve Hard Exploration Problems, 2019, ICLR.
[25] Srivatsan Srinivasan, et al. Truly Batch Apprenticeship Learning with Deep Successor Features, 2019, IJCAI.
[26] Christos Dimitrakakis, et al. Bayesian Multitask Inverse Reinforcement Learning, 2011, EWRL.
[27] Pablo Hernandez-Leal, et al. A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity, 2017, ArXiv.
[28] Oliver Kroemer, et al. Learning to select and generalize striking movements in robot table tennis, 2012, AAAI Fall Symposium: Robots Learning Interactively from Human Teachers.
[29] K. Laland. Darwin's Unfinished Symphony: How Culture Made the Human Mind, 2017.
[30] Anca D. Dragan, et al. SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards, 2019, ICLR.
[31] Dirk Helbing, et al. Enhanced intelligent driver model to access the impact of driving strategies on traffic capacity, 2009, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences.
[32] J. Schulman, et al. Leveraging Procedural Generation to Benchmark Reinforcement Learning, 2019, ICML.
[33] Yang Gao, et al. Reinforcement Learning from Imperfect Demonstrations, 2018, ICLR.
[34] Sebastian Tschiatschek, et al. Successor Uncertainties: Exploration and Uncertainty in Temporal Difference Learning, 2018, NeurIPS.
[35] Raymond J. Dolan, et al. Game Theory of Mind, 2008, PLoS Computational Biology.
[36] Sonia Chernova, et al. Integrating reinforcement learning with human demonstrations of varying ability, 2011, AAMAS.
[37] Sergey Levine, et al. Time-Contrastive Networks: Self-Supervised Learning from Video, 2018, ICRA.
[38] Prabhat Nagarajan, et al. Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations, 2019, ICML.
[39] Shane Legg, et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures, 2018, ICML.
[40] Stefan Schaal, et al. Robot Learning From Demonstration, 1997, ICML.
[41] J. Henrich. The Secret of Our Success: How Culture Is Driving Human Evolution, Domesticating Our Species, and Making Us Smarter, 2015.
[42] Sergey Levine, et al. Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation, 2018, ICRA.
[43] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[44] Tom Schaul, et al. Successor Features for Transfer in Reinforcement Learning, 2016, NIPS.
[45] Peter Dayan, et al. Improving Generalization for Temporal Difference Learning: The Successor Representation, 1993, Neural Computation.
[46] D. Stahl. Evolution of Smartₙ Players, 1991.
[47] Marlos C. Machado, et al. Count-Based Exploration with the Successor Representation, 2018, AAAI.
[48] Sergio Gomez Colmenarejo, et al. Acme: A Research Framework for Distributed Reinforcement Learning, 2020, ArXiv.
[49] Jordan L. Boyd-Graber, et al. Opponent Modeling in Deep Reinforcement Learning, 2016, ICML.
[50] Doina Precup, et al. Fast reinforcement learning with generalized policy updates, 2020, Proceedings of the National Academy of Sciences.
[51] Shimon Whiteson, et al. Inverse Reinforcement Learning from Failure, 2016, AAMAS.
[52] Abhinav Gupta, et al. Multiple Interactions Made Easy (MIME): Large Scale Demonstrations Data for Imitation, 2018, CoRL.
[53] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[54] Siyuan Liu, et al. Robust Bayesian Inverse Reinforcement Learning with Sparse Behavior Noise, 2014, AAAI.
[55] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[56] Stephen P. Boyd, et al. Convex Optimization, 2004, Cambridge University Press.
[57] Ion Stoica, et al. Tune: A Research Platform for Distributed Model Selection and Training, 2018, ArXiv.
[58] Nando de Freitas, et al. Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning, 2018, ICML.
[59] Sergey Levine, et al. QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation, 2018, CoRL.
[60] Brett Browning, et al. A survey of robot learning from demonstration, 2009, Robotics and Autonomous Systems.
[61] Sergey Levine, et al. Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations, 2017, Robotics: Science and Systems.
[62] Ken Goldberg, et al. Deep Imitation Learning for Complex Manipulation Tasks from Virtual Reality Teleoperation, 2017, ICRA.
[63] Sergey Levine, et al. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, 2016, ICML.
[64] Rouhollah Rahmatizadeh, et al. Vision-Based Multi-Task Manipulation for Inexpensive Robots Using End-to-End Learning from Demonstration, 2018, ICRA.
[65] Matthew E. Taylor, et al. Agent Modeling as Auxiliary Task for Deep Reinforcement Learning, 2019, AIIDE.
[66] Dean Pomerleau, et al. Efficient Training of Artificial Neural Networks for Autonomous Navigation, 1991, Neural Computation.
[67] Arthur C. Graesser, et al. Is it an Agent, or Just a Program?: A Taxonomy for Autonomous Agents, 1996, ATAL.
[68] Peter Stone, et al. Autonomous agents modelling other agents: A comprehensive survey and open problems, 2017, Artificial Intelligence.
[69] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[70] Sergio Gomez Colmenarejo, et al. One-Shot High-Fidelity Imitation: Training Large-Scale Deep Nets with RL, 2018, ArXiv.
[71] Sergey Levine, et al. Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?, 2020, ICML.
[72] Songhwai Oh, et al. Robust Learning From Demonstrations With Mixed Qualities Using Leveraged Gaussian Processes, 2019, IEEE Transactions on Robotics.
[73] Markus Wulfmeier, et al. Maximum Entropy Deep Inverse Reinforcement Learning, 2015, ArXiv.
[74] Pieter Abbeel, et al. Learning for control from multiple demonstrations, 2008, ICML.
[75] Peter Stone, et al. Behavioral Cloning from Observation, 2018, IJCAI.
[76] J. Andrew Bagnell, et al. Maximum margin planning, 2006, ICML.
[77] Sergey Levine, et al. A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models, 2016, ArXiv.
[78] Max Jaderberg, et al. Population Based Training of Neural Networks, 2017, ArXiv.
[79] Samuel Gershman, et al. Deep Successor Reinforcement Learning, 2016, ArXiv.
[80] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[81] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[82] S. Levine, et al. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems, 2020, ArXiv.
[83] Nando de Freitas, et al. Playing hard exploration games by watching YouTube, 2018, NeurIPS.
[84] Marcin Andrychowicz, et al. Overcoming Exploration in Reinforcement Learning with Demonstrations, 2018, ICRA.
[85] Dean Pomerleau. ALVINN: An Autonomous Land Vehicle in a Neural Network, 1988, NIPS.
[86] Aude Billard, et al. Donut as I do: Learning from failed demonstrations, 2011, ICRA.
[87] Tom Heskes, et al. Solving a Huge Number of Similar Tasks: A Combination of Multi-Task Learning and a Hierarchical Bayesian Approach, 1998, ICML.