Sergey Levine | Eric Xing | Ruslan Salakhutdinov | Emilio Parisotto | Lisa Lee | Benjamin Eysenbach
[1] Tom Schaul, et al. Unifying Count-Based Exploration and Intrinsic Motivation, 2016, NIPS.
[2] Sergey Levine, et al. Learning Robust Rewards with Adversarial Inverse Reinforcement Learning, 2017, ICLR.
[3] Peter L. Bartlett, et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning, 2016, ArXiv.
[4] Andrew Y. Ng, et al. Near-Bayesian exploration in polynomial time, 2009, ICML.
[5] J. Robinson. An Iterative Method of Solving a Game, 1951, Classics in Game Theory.
[6] Nancy A. Lynch, et al. Distributed Algorithms, 1992, Lecture Notes in Computer Science.
[7] Benjamin Van Roy, et al. Coordinated Exploration in Concurrent Reinforcement Learning, 2018, ICML.
[8] Qiang Liu, et al. Learning to Explore with Meta-Policy Gradient, 2018, ICML.
[9] Nuttapong Chentanez, et al. Intrinsically Motivated Reinforcement Learning, 2004, NIPS.
[10] Jürgen Schmidhuber, et al. A possibility for implementing curiosity and boredom in model-building neural controllers, 1991.
[11] Pierre-Yves Oudeyer, et al. CURIOUS: Intrinsically Motivated Multi-Task, Multi-Goal Reinforcement Learning, 2018, ICML.
[12] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[13] Pierre-Yves Oudeyer, et al. CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning, 2018, ICML.
[14] Sergey Levine, et al. Skew-Fit: State-Covering Self-Supervised Reinforcement Learning, 2019, ICML.
[15] O. H. Brownlee, et al. Activity Analysis of Production and Allocation, 1952.
[16] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[17] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[18] Sergey Levine, et al. Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review, 2018, ArXiv.
[19] Constantinos Daskalakis, et al. A Counter-example to Karlin's Strong Conjecture for Fictitious Play, 2014, IEEE 55th Annual Symposium on Foundations of Computer Science.
[20] Marlos C. Machado, et al. Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment, 2019, ArXiv.
[21] Marcin Andrychowicz, et al. Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research, 2018, ArXiv.
[22] Henry Zhu, et al. Dexterous Manipulation with Deep Reinforcement Learning: Efficient, General, and Low-Cost, 2018, International Conference on Robotics and Automation (ICRA).
[23] Shane Legg, et al. Deep Reinforcement Learning from Human Preferences, 2017, NIPS.
[24] Christopher M. Bishop. Pattern Recognition and Machine Learning, 2006, Springer.
[25] Ruslan Salakhutdinov, et al. Concurrent Meta Reinforcement Learning, 2019, ArXiv.
[26] Vicenç Gómez, et al. Optimal control as a graphical model inference problem, 2009, Machine Learning.
[27] Shipra Agrawal, et al. Optimistic posterior sampling for reinforcement learning: worst-case regret bounds, 2017, NIPS.
[28] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[29] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[30] Yasemin Altun, et al. Relative Entropy Policy Search, 2010.
[31] Henry Zhu, et al. ROBEL: Robotics Benchmarks for Learning with Low-Cost Robots, 2019, CoRL.
[32] Siddhartha S. Srinivasa, et al. Planning-based prediction for pedestrians, 2009, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[33] Sergey Levine, et al. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, 2016, ICML.
[34] Sergey Levine, et al. Diversity is All You Need: Learning Skills without a Reward Function, 2018, ICLR.
[35] Marc Toussaint, et al. Probabilistic inference for solving discrete and continuous state Markov Decision Processes, 2006, ICML.
[36] Amos J. Storkey, et al. Exploration by Random Network Distillation, 2018, ICLR.
[37] Pieter Abbeel, et al. Meta-Learning with Temporal Convolutions, 2017, ArXiv.
[38] Sergey Levine, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, 2017, ICML.
[39] Sergey Levine, et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, 2015, ArXiv.
[40] Sergey Levine, et al. Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings, 2018, ICML.
[41] Eric van Damme, et al. Non-Cooperative Games, 2000.
[42] Pieter Abbeel, et al. Variational Option Discovery Algorithms, 2018, ArXiv.
[43] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[44] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[45] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[46] J. Andrew Bagnell, et al. Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy, 2010.
[47] Sham M. Kakade, et al. Provably Efficient Maximum Entropy Exploration, 2018, ICML.
[48] Jürgen Schmidhuber, et al. Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010), 2010, IEEE Transactions on Autonomous Mental Development.
[49] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[50] Leslie Pack Kaelbling, et al. Learning to Achieve Goals, 1993, IJCAI.
[51] Shane Legg, et al. Noisy Networks for Exploration, 2017, ICLR.
[52] Song-Chun Zhu, et al. Inferring "Dark Matter" and "Dark Energy" from Videos, 2013, IEEE International Conference on Computer Vision.
[53] Seif Haridi, et al. Distributed Algorithms, 1992, Lecture Notes in Computer Science.
[54] Pieter Abbeel, et al. A Simple Neural Attentive Meta-Learner, 2017, ICLR.
[55] Marc Toussaint, et al. On Stochastic Optimal Control and Reinforcement Learning by Approximate Inference, 2012, Robotics: Science and Systems.
[56] David Barber, et al. The IM algorithm: a variational approach to Information Maximization, 2003, NIPS.
[57] Sergey Levine, et al. Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables, 2019, ICML.
[58] Pieter Abbeel, et al. Automatic Goal Generation for Reinforcement Learning Agents, 2017, ICML.
[59] Evangelos Theodorou, et al. Relative entropy and free energy dualities: Connections to Path Integral and KL control, 2012, 51st IEEE Conference on Decision and Control (CDC).
[60] Sergey Levine, et al. Visual Reinforcement Learning with Imagined Goals, 2018, NeurIPS.
[61] Filip De Turck, et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning, 2016, NIPS.
[62] Marcin Andrychowicz, et al. Parameter Space Noise for Exploration, 2017, ICLR.
[63] Yuval Tassa, et al. Maximum a Posteriori Policy Optimisation, 2018, ICLR.
[64] Pierre-Yves Oudeyer, et al. Intrinsic Motivation Systems for Autonomous Mental Development, 2007, IEEE Transactions on Evolutionary Computation.
[65] Sergey Levine, et al. Meta-Reinforcement Learning of Structured Exploration Strategies, 2018, NeurIPS.
[66] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[67] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration, 2016, NIPS.