Peter Stone | Scott Niekum | Mauricio Tec | Ishan Durugkar
[1] Nuttapong Chentanez, et al. Intrinsically Motivated Learning of Hierarchical Collections of Skills, 2004.
[2] Huang Xiao, et al. Wasserstein Adversarial Imitation Learning, 2019, ArXiv.
[3] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 2002, Machine Learning.
[4] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[5] Peter Stone, et al. Deep Reinforcement Learning in Parameterized Action Space, 2015, ICLR.
[6] A. Barto, et al. Intrinsic Motivation For Reinforcement Learning Systems, 2005.
[7] Marcin Andrychowicz, et al. Hindsight Experience Replay, 2017, NIPS.
[8] Sergey Levine, et al. Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery, 2020, ICLR.
[9] Sergey Levine, et al. C-Learning: Learning to Achieve Goals via Recursive Classification, 2020, ICLR.
[10] O. Bousquet, et al. From optimal transport to generative modeling: the VEGAN cookbook, 2017, ArXiv:1705.07642.
[11] Pierre-Yves Oudeyer, et al. Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning, 2017, J. Mach. Learn. Res.
[12] Herke van Hoof, et al. Addressing Function Approximation Error in Actor-Critic Methods, 2018, ICML.
[13] Junhyuk Oh, et al. What Can Learned Intrinsic Rewards Capture?, 2019, ICML.
[14] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[15] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[16] Andrew G. Barto, et al. An intrinsic reward mechanism for efficient exploration, 2006, ICML.
[17] Michael B. Smyth, et al. Quasi Uniformities: Reconciling Domains with Metric Spaces, 1987, MFPS.
[18] Pierre-Yves Oudeyer, et al. What is Intrinsic Motivation? A Typology of Computational Approaches, 2007, Frontiers in Neurorobotics.
[19] Matthieu Geist, et al. Primal Wasserstein Imitation Learning, 2020, ICLR.
[20] Satinder Singh, et al. On Learning Intrinsic Rewards for Policy Gradient Methods, 2018, NeurIPS.
[21] C. Villani. Optimal Transport: Old and New, 2008.
[22] Aaron C. Courville, et al. Improved Training of Wasserstein GANs, 2017, NIPS.
[23] Oriol Vinyals, et al. Synthesizing Programs for Images using Reinforced Adversarial Learning, 2018, ICML.
[24] Carola-Bibiane Schönlieb, et al. Wasserstein GANs Work Because They Fail (to Approximate the Wasserstein Distance), 2021, ArXiv.
[25] Sepp Hochreiter, et al. RUDDER: Return Decomposition for Delayed Rewards, 2018, NeurIPS.
[26] Gabriel Peyré, et al. Computational Optimal Transport, 2018, Found. Trends Mach. Learn.
[27] Marlos C. Machado, et al. Exploration in Reinforcement Learning with Deep Covering Options, 2020, ICLR.
[28] Sham M. Kakade, et al. Provably Efficient Maximum Entropy Exploration, 2018, ICML.
[29] Scott Niekum. Evolved Intrinsic Reward Functions for Reinforcement Learning, 2010, AAAI.
[30] Tianwei Ni, et al. f-IRL: Inverse Reinforcement Learning via State Marginal Matching, 2020, ArXiv.
[31] A. Barto, et al. Intrinsic motivations and open-ended development in animals, humans, and robots: an overview, 2014, Frontiers in Psychology.
[32] Tom Schaul, et al. Unifying Count-Based Exploration and Intrinsic Motivation, 2016, NIPS.
[33] Marco Mirolli, et al. Which is the best intrinsic motivation signal for learning multiple skills?, 2013, Frontiers in Neurorobotics.
[34] Sergey Levine, et al. Reinforcement Learning with Deep Energy-Based Policies, 2017, ICML.
[35] Peter Henderson, et al. An Information-Theoretic Perspective on Credit Assignment in Reinforcement Learning, 2021, ArXiv.
[36] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[37] Léon Bottou, et al. Wasserstein Generative Adversarial Networks, 2017, ICML.
[38] Philip S. Thomas, et al. Is the Policy Gradient a Gradient?, 2019, AAMAS.
[39] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[40] Pieter Abbeel, et al. Automatic Curriculum Learning through Value Disagreement, 2020, NeurIPS.
[41] Peter Auer, et al. Near-optimal Regret Bounds for Reinforcement Learning, 2008, J. Mach. Learn. Res.
[42] U. Rieder, et al. Markov Decision Processes, 2010.
[43] Gautier Stauffer, et al. The Stochastic Shortest Path Problem: A polyhedral combinatorics perspective, 2017, Eur. J. Oper. Res.
[44] J. Liao, et al. Sharpening Jensen's Inequality, 2017, The American Statistician.
[45] Richard L. Lewis, et al. Reward Design via Online Gradient Ascent, 2010, NIPS.
[46] Amos J. Storkey, et al. Exploration by Random Network Distillation, 2018, ICLR.
[47] Margaret J. Robertson, et al. Design and Analysis of Experiments, 2006, Handbook of Statistics.
[48] Richard L. Lewis, et al. Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective, 2010, IEEE Transactions on Autonomous Mental Development.
[49] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[50] Peter Stone, et al. Reward (Mis)design for Autonomous Driving, 2021, ArXiv.
[51] Nils J. Nilsson, et al. A Formal Basis for the Heuristic Determination of Minimum Cost Paths, 1968, IEEE Trans. Syst. Sci. Cybern.
[52] Peter Stone, et al. Generative Adversarial Imitation from Observation, 2018, ArXiv.
[53] Andrew G. Barto, et al. Intrinsic Motivation and Reinforcement Learning, 2013, Intrinsically Motivated Learning in Natural and Artificial Systems.
[54] Marc G. Bellemare, et al. The Cramer Distance as a Solution to Biased Wasserstein Gradients, 2017, ArXiv.
[55] Richard L. Lewis, et al. Internal Rewards Mitigate Agent Boundedness, 2010, ICML.
[56] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[57] Sergey Levine, et al. Learning Robust Rewards with Adversarial Inverse Reinforcement Learning, 2017, ICLR.
[58] Leslie Pack Kaelbling, et al. Learning to Achieve Goals, 1993, IJCAI.
[59] Filip Jevtić. Combinatorial Structure of Finite Metric Spaces, 2018.
[60] Sergey Levine, et al. Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification, 2021, NeurIPS.
[61] Pieter Abbeel, et al. Goal-conditioned Imitation Learning, 2019, NeurIPS.
[62] Nuttapong Chentanez, et al. Intrinsically Motivated Reinforcement Learning, 2004, NIPS.
[63] Rich Caruana, et al. Multitask Learning, 1998, Encyclopedia of Machine Learning and Data Mining.
[64] Sergey Levine, et al. DisCo RL: Distribution-Conditioned Reinforcement Learning for General-Purpose Policies, 2021, ICRA.
[65] Pierre-Yves Oudeyer, et al. How can we define intrinsic motivation?, 2008.
[66] Pierre-Yves Oudeyer, et al. R-IAC: Robust Intrinsically Motivated Exploration and Active Learning, 2009, IEEE Transactions on Autonomous Mental Development.
[67] G. Baldassarre, et al. Evolving internal reinforcers for an intrinsically motivated reinforcement-learning robot, 2007, IEEE International Conference on Development and Learning (ICDL).
[68] Sergey Levine, et al. Efficient Exploration via State Marginal Matching, 2019, ArXiv.
[69] G. Qiu, et al. Lipschitz constrained GANs via boundedness and continuity, 2020, Neural Computing and Applications.
[70] Sergey Levine, et al. SMiRL: Surprise Minimizing Reinforcement Learning in Unstable Environments, 2021, ICLR.
[71] Richard Zemel, et al. A Divergence Minimization Perspective on Imitation Learning Methods, 2019, CoRL.
[72] Gianluca Baldassarre, et al. What are intrinsic motivations? A biological perspective, 2011, IEEE International Conference on Development and Learning (ICDL).
[73] Doina Precup, et al. Self-supervised Learning of Distance Functions for Goal-Conditioned Reinforcement Learning, 2019, ArXiv.